ABSTRACT. We give a self-contained proof of the $A_2$ conjecture, which claims that the norm of any
Calderón-Zygmund operator is bounded by the first degree of the $A_2$ norm of the weight. The original
proof of this result by the first author relied on a subtle and rather difficult reduction to a testing
condition by the last three authors. Here we replace this reduction by a new weighted norm bound for dyadic
shifts (linear in the $A_2$ norm of the weight and quadratic in the complexity of the shift), which
is based on a new quantitative two-weight inequality for the shifts. These sharp one- and two-weight
bounds for dyadic shifts are the main new results of this paper. They are obtained by rethinking the
corresponding previous results of Lacey-Petermichl-Reguera and Nazarov-Treil-Volberg. To complete
the proof of the $A_2$ conjecture, we also provide a simple variant of the representation, already in
the original proof, of an arbitrary Calderón-Zygmund operator as an average of random dyadic shifts
and random dyadic paraproducts. This method of representation amounts to a refinement of the
techniques from nonhomogeneous Harmonic Analysis.



1. INTRODUCTION 

A Calderón-Zygmund operator in $\mathbb{R}^d$ is an integral operator, bounded in $L^2$ and with kernel $K$
satisfying the following growth and smoothness conditions:

(i) $|K(x,y)| \le \dfrac{C_{cz}}{|x-y|^d}$ for all $x, y \in \mathbb{R}^d$, $x \ne y$.

(ii) There exists $\alpha > 0$ such that

$$|K(x,y) - K(x',y)| + |K(y,x) - K(y,x')| \le C_{cz} \frac{|x-x'|^{\alpha}}{|x-y|^{d+\alpha}}$$

for all $x, x', y \in \mathbb{R}^d$ such that $|x-x'| < |x-y|/2$.
It is well known that a Calderón-Zygmund operator is bounded in the weighted space $L^2(w)$ if
(and for many Calderón-Zygmund operators only if) the weight $w$ satisfies the famous Muckenhoupt
$A_2$ condition

(1.1) $\sup_Q \Bigl(|Q|^{-1}\int_Q w\,dx\Bigr)\Bigl(|Q|^{-1}\int_Q w^{-1}\,dx\Bigr) =: [w]_{A_2} < \infty.$

The quantity $[w]_{A_2}$ is called the Muckenhoupt norm of the weight $w$ (although it is definitely not a
norm).
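
As a quick illustration of (1.1), and not part of the original argument, the following sketch evaluates the product of averages in (1.1) for the power weight $w(x) = |x|^a$ on the line, $-1 < a < 1$, over intervals $[s,t]$ with $0 \le s < t$, using the closed-form antiderivative. The function name `a2_product` and the choice of weight are illustrative assumptions. On intervals $[0,t]$ the product equals $1/(1-a^2)$ at every scale, so this number is a lower bound for $[w]_{A_2}$ and it blows up as $|a| \to 1$.

```python
def a2_product(s: float, t: float, a: float) -> float:
    """Product of the averages of w and 1/w over [s, t], 0 <= s < t, for w(x) = x**a, |a| < 1."""
    if not (0 <= s < t) or not (-1 < a < 1):
        raise ValueError("need 0 <= s < t and -1 < a < 1")

    def avg_power(p: float) -> float:
        # Average of x**p over [s, t], computed from the antiderivative x**(p+1)/(p+1).
        return (t ** (p + 1) - s ** (p + 1)) / ((p + 1) * (t - s))

    return avg_power(a) * avg_power(-a)


if __name__ == "__main__":
    a = 0.5
    # Intervals touching the origin: the product is 1/(1 - a^2) at every scale.
    print([round(a2_product(0.0, 2.0 ** -k, a), 6) for k in range(4)])
    # An interval far from the origin: the weight is nearly constant there.
    print(round(a2_product(100.0, 101.0, a), 6))
```

Intervals that straddle the origin asymmetrically can give a somewhat larger product, so the value on $[0,t]$ should be read only as a lower bound for the supremum in (1.1).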

It has been an old problem to describe how the norm of a Calderón-Zygmund operator in the
weighted space $L^2(w)$ depends on the Muckenhoupt norm $[w]_{A_2}$ of $w$. A conjecture was that for a
fixed Calderón-Zygmund operator $T$ its norm is bounded by $C \cdot [w]_{A_2}$, where the constant $C$ depends



2010 Mathematics Subject Classification. 42B20, 42B35, 47A30. 

Key words and phrases. Calderón-Zygmund operators, $A_2$ weights, Carleson embedding theorem, Corona
decomposition, stopping time, nonhomogeneous Harmonic Analysis.

Work of T. Hytönen is supported by the Academy of Finland under grants 130166, 133264 and 218148.
Work of C. Pérez is supported by the Spanish Research Council grant.

Work of S. Treil is supported by the National Science Foundation under the grant DMS-0800876. 
Work of A. Volberg is supported by the National Science Foundation under the grant DMS-0758552. 




TUOMAS HYTÖNEN, CARLOS PÉREZ, SERGEI TREIL, AND ALEXANDER VOLBERG



on the operator $T$ (but not on the weight $w$). Simple counterexamples demonstrate that for the classical
operators like the Hilbert transform or the Riesz transform, a better estimate than $C \cdot [w]_{A_2}$ is not possible.

This linear (in $[w]_{A_2}$) estimate of the norm has become known as the $A_2$ conjecture.

For the maximal function, the estimate $C \cdot [w]_{A_2}$ was proved by S. Buckley [?]; he also proved that
this estimate is optimal for the maximal function. The first result for a singular "integral" operator
was due to J. Wittwer [43], who proved the $A_2$ conjecture for the Haar multipliers. The same result
for the Beurling-Ahlfors transform (convolution with $\pi^{-1} z^{-2}$ in $\mathbb{C}$) was obtained first by
Petermichl-Volberg [?] by using a combination of the Bellman function technique and the heat extension, and
later by Dragičević-Volberg [7] via the representation of the Beurling-Ahlfors transform as an average of
Haar multipliers over all dyadic lattices.

This result was used in [?] to answer positively an important question in the theory of quasiconformal
maps, see [1], about whether a weakly quasiregular map is quasiregular (or equivalently whether
there is a self-improvement of a solution of the Beltrami equation in the case of critical exponent).

Then S. Petermichl [32] proved the $A_2$ conjecture for the Hilbert transform, again using the
representation of the Hilbert transform as an average of copies of a simple dyadic operator (the so-called
dyadic, or Haar, shift of complexity 1).

We should mention here an earlier paper by R. Fefferman and J. Pipher [10], where a linear estimate
in terms of the stronger $A_1$ norm of the weight $w$ was obtained for the Hilbert transform. This result found
its application in geometric questions pertinent to multi-parameter Harmonic Analysis, in particular
for singular operators on the Heisenberg group. The result in [32] is a considerable strengthening of
Fefferman-Pipher's theorem.

A recent paper [17] by M. Lacey, S. Petermichl and M. Reguera established the $A_2$ conjecture for
general dyadic shifts. Another proof of the linear bound for dyadic shifts was obtained in
Cruz-Uribe-Martell-Pérez [?], in a very beautiful and concise approach based on a remarkable "formula" by
Lerner [18]. Thus, the conjecture was proved for all operators which can be represented by taking for
each dyadic grid a sum of finitely many dyadic shifts of uniformly bounded complexity (see the definition
below) and taking the average over all grids.

In particular, as was shown by A. Vagharshakyan [42], any convolution Calderón-Zygmund
operator on the real line $\mathbb{R}$ with sufficiently smooth kernel can be obtained by averaging copies of just
one Haar shift, so the $A_2$ conjecture holds for such operators.

Note that the estimates of the norms of the dyadic shifts obtained in [17] and in [?], [5]
grew exponentially in the complexity of the shift, so it was only possible to estimate the
Calderón-Zygmund operators obtained by averaging finitely many such shifts.

Using linear estimates for the dyadic shifts and a special decomposition (in the form proposed by
Xiang [44]) of a Calderón-Zygmund operator, Hytönen-Lacey-Reguera-Sawyer-Vagharshakyan-Uriarte-Tuero
in [14] proved the $A_2$ conjecture for all Calderón-Zygmund operators with sufficiently
smooth kernels (the smoothness was dependent on the dimension in [14]). However, the problem
for general Calderón-Zygmund operators required (as we shall see) some probabilistic ideas rooted in
non-homogeneous Harmonic Analysis [20], [24] (see also the lecture notes [?]).

For general Calderón-Zygmund operators, the last three authors [35] reduced the $A_2$ conjecture to
a weak type estimate by establishing the inequality 

In [35] it is also shown that the $A_2$ conjecture is equivalent to getting an estimate linear in $[w]_{A_2}$ on
the simplest test functions (this is a $T(1)$ theorem in the presence of a weight). Using this result of
Pérez-Treil-Volberg and the technique developed in [17], the first author in [12] was able to prove the $A_2$
conjecture for general Calderón-Zygmund operators, i.e., the following theorem:
Theorem 1.1 ([12]). Let $T$ be a Calderón-Zygmund operator and $w$ be an $A_2$ weight. Then

$$\|T\|_{L^2(w) \to L^2(w)} \le C\,[w]_{A_2},$$

where the constant $C$ depends only on the dimension $d$, the parameters $C_{cz}$, $\alpha$ of the
Calderón-Zygmund operator and its norm in the non-weighted $L^2$.

A crucial new element in [12] was a clever averaging trick, allowing one to get rid of the so-called
bad cubes and thus represent an arbitrary Calderón-Zygmund operator as a weighted average of
(infinitely many) dyadic shifts. This averaging trick was a development of the bootstrapping argument
used by Nazarov-Treil-Volberg [24], where they exploited the fact that the bad part of a function can
be made arbitrarily small. Using the original Nazarov-Treil-Volberg averaging trick would add an
extra factor depending on $[w]_{A_2}$ to the estimate, so a new idea was necessary. A new observation in
[12] was that as soon as the probability of a "bad" cube is less than 1, it is possible to completely
ignore the bad cubes (at least in the situation where they cause troubles).

The preprint [12], which itself is neither short nor very simple, relies on a rather technically involved
preprint [35]. Thus the necessity of a simpler, direct proof, not using the reduction to the weak type
estimates, seems pretty evident.

Such a direct proof of Theorem 1.1 is presented in this paper; moreover, we obtain new results on
the dyadic shifts into which the Calderón-Zygmund operator $T$ is decomposed. Indeed, the reduction
of the $A_2$ conjecture to a testing condition, which in [35] was made on the level of the
Calderón-Zygmund operator $T$, is here performed on the more elementary level of the dyadic shifts in the
representation of $T$. The possibility of such a simplification in the proof of the $A_2$ conjecture was
suggested in [12, Sec. 8.A], and here we carry out this program in detail.

The main components of the proof are as follows: 

(i) An averaging trick, which is a version of the one from [12] (unlike [12], we do not need good
shifts here, and this simplifies the matter). This trick allows us not to worry about "bad"
cubes and to represent a general Calderón-Zygmund operator as a weighted average of dyadic
shifts with the weights decaying exponentially in the complexity of the shifts.

(ii) Sharp estimates, with all the constants written down, in the two weight $T(1)$ theorem from
[25] in the setting of dyadic shifts (Theorem 3.4). Note that while most of the necessary
estimates were done in [25], a formal application of the result from [25] would give an
exponential (in complexity) growth of the norm.

To get the polynomial (in complexity) growth, one needs some non-trivial modifications.
For the convenience of the reader we present the complete proof, not only the modifications:
only describing modifications and referring the reader to the proof in [25] would make the
paper unreadable.

(iii) A modification of the proof from [17], which gives a polynomial in complexity, instead of
exponential as in [17], bound for the weighted norm of the dyadic shift (Theorem 5.1). The
main difference compared to [17] is a better (linear in complexity instead of exponential)
estimate of the (non-weighted) weak $L^1$ norm of a dyadic shift, which was obtained in [12].

The rest of the proof essentially follows the construction from [17], keeping track of
constants, and clarifying parts of the proof that were presented there in a sketchy way. We
note that a variant of such a modification of [17] already appeared in [12], where it was used
to verify the required testing conditions for $T$, but not an explicit norm bound for the shifts
themselves.

Aside from the new self-contained proof of Theorem 1.1, the above-mentioned Theorems 3.4 and
5.1, giving sharp quantitative two-weight and one-weight bounds for dyadic shifts, are the main new
results of this paper.
2. Dyadic lattices and martingale difference decompositions. Random dyadic lattices

2.1. Random dyadic lattices. The standard dyadic system in $\mathbb{R}^d$ is

$$\mathcal{D}^0 := \bigcup_{k \in \mathbb{Z}} \mathcal{D}^0_k, \qquad \mathcal{D}^0_k := \{2^k([0,1)^d + m) : m \in \mathbb{Z}^d\}.$$

For $I \in \mathcal{D}^0_k$ and a binary sequence $\omega = (\omega_j)_{j \in \mathbb{Z}} \in (\{0,1\}^d)^{\mathbb{Z}}$, let

$$I + \omega := I + \sum_{j < k} \omega_j 2^j.$$

Following Nazarov, Treil and Volberg [24, Section 9.1], consider general dyadic systems of the form

$$\mathcal{D} = \mathcal{D}^{\omega} := \{I + \omega : I \in \mathcal{D}^0\} = \bigcup_{k \in \mathbb{Z}} \mathcal{D}^{\omega}_k.$$

Given a cube $I = x + [0,\ell)^d$, let

$$\mathrm{ch}(I) := \{x + \eta\ell/2 + [0,\ell/2)^d : \eta \in \{0,1\}^d\}$$

denote the collection of dyadic children of $I$. Thus $\mathcal{D}^{\omega}_{k-1} = \bigcup\{\mathrm{ch}(I) : I \in \mathcal{D}^{\omega}_k\}$. Note that, in line with
[24] but contrary to [12], we use the "geometric" indexing of cubes, where larger $k$ refers to larger
cubes, rather than the "probabilistic" indexing, where larger $k$ would refer to finer sigma-algebras.

Consider the standard probability measure on $\{0,1\}^d$ which assigns equal probability $2^{-d}$ to every
point. Define the measure $\mathbb{P}$ on $(\{0,1\}^d)^{\mathbb{Z}}$ as the corresponding product measure.
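
To make the construction concrete, here is a minimal one-dimensional sketch (an illustration under stated assumptions, not the construction used in the proofs). It truncates the binary sequence to indices $j \ge$ `j_min`, which is an assumption made only so that the offset is a finite sum, and uses exact rational arithmetic. The helper names `shift`, `interval` and `children` are hypothetical. It shows that the two halves of a shifted interval of side $2^k$ are again shifted intervals of side $2^{k-1}$, which is the nestedness property stated above.

```python
import random
from fractions import Fraction


def shift(omega: dict, k: int, j_min: int) -> Fraction:
    """Truncated offset sum_{j_min <= j < k} omega_j * 2**j used for intervals of side 2**k."""
    return sum((Fraction(2) ** j * omega[j] for j in range(j_min, k)), Fraction(0))


def interval(omega: dict, k: int, m: int, j_min: int):
    """Shifted dyadic interval 2**k * ([0, 1) + m) + offset, returned as (left, right)."""
    left = Fraction(2) ** k * m + shift(omega, k, j_min)
    return left, left + Fraction(2) ** k


def children(iv):
    """The two dyadic children (halves) of an interval (left, right)."""
    left, right = iv
    mid = (left + right) / 2
    return [(left, mid), (mid, right)]


if __name__ == "__main__":
    random.seed(0)
    j_min = -6
    # One random bit per scale: a sample from the (truncated) product measure.
    omega = {j: random.randint(0, 1) for j in range(j_min, 4)}
    parent = interval(omega, 2, 0, j_min)
    print(parent, children(parent))
```

With $\omega_{k-1}$ denoting the bit at scale $k-1$, the children of the interval with index $m$ at level $k$ are the intervals with indices $2m + \omega_{k-1}$ and $2m + \omega_{k-1} + 1$ at level $k-1$.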

2.2. Martingale difference decompositions and Haar functions. For a cube $I$ in $\mathbb{R}^d$ let

$$\mathbb{E}_I f := \Bigl(|I|^{-1}\int_I f\,dx\Bigr)\mathbf{1}_I, \qquad \Delta_I := -\mathbb{E}_I + \sum_{J \in \mathrm{ch}(I)} \mathbb{E}_J.$$

It is well known that for an arbitrary dyadic lattice $\mathcal{D}$ every function $f \in L^2(\mathbb{R}^d)$ admits the orthogonal
decomposition

$$f = \sum_{I \in \mathcal{D}} \Delta_I f.$$

We also need the weighted martingale difference decomposition. Let $\mu$ be a Radon measure on
$\mathbb{R}^d$. Define the weighted expectation and martingale differences as

$$\mathbb{E}^{\mu}_I f := \Bigl(\mu(I)^{-1}\int_I f\,d\mu\Bigr)\mathbf{1}_I, \qquad \Delta^{\mu}_I := -\mathbb{E}^{\mu}_I + \sum_{J \in \mathrm{ch}(I)} \mathbb{E}^{\mu}_J;$$

for definiteness we set $\mathbb{E}^{\mu}_I f = 0$ if $\mu(I) = 0$.

For an arbitrary dyadic lattice $\mathcal{D}$ and $k \in \mathbb{Z}$, any function $f \in L^2(\mu)$ admits an orthogonal
decomposition

(2.1) $f = \sum_{I \in \mathcal{D}:\, \ell(I) = 2^k} \mathbb{E}^{\mu}_I f + \sum_{I \in \mathcal{D}:\, \ell(I) \le 2^k} \Delta^{\mu}_I f.$

Given a cube $Q$ in $\mathbb{R}^d$, any function in the martingale difference space $\Delta_Q L^2$ is called a Haar
function (corresponding to $Q$) and is usually denoted by $h_Q$. Note that $h_Q$ denotes a generic Haar
function, not any particular one.

A generalized Haar function is a linear combination of a Haar function and $\mathbf{1}_Q$. In other words,
a generalized Haar function $h_Q$ is constant on the children of $Q$, but unlike the regular Haar function
it is not orthogonal to constants.

Similarly, a function $h \in \Delta^{\mu}_Q L^2(\mu)$ is called a weighted Haar function and is denoted by $h^{\mu}_Q$.
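
The unweighted decomposition can be checked on a toy example. The sketch below is an illustrative assumption, not taken from the paper: it works with a function sampled at $2^n$ equally spaced points of an interval (so the measure is the counting measure on the samples), and the helper names `averages` and `martingale_differences` are hypothetical. It computes the top-level expectation and the level-by-level sums of martingale differences, which telescope back to the function and are mutually orthogonal.

```python
def averages(f, block_size):
    """Projection of f onto functions constant on dyadic blocks of `block_size` samples."""
    out = []
    for start in range(0, len(f), block_size):
        block = f[start:start + block_size]
        mean = sum(block) / block_size
        out.extend([mean] * block_size)
    return out


def martingale_differences(f):
    """Return (E_top f, [D_1, D_2, ...]) with f = E_top f + D_1 + D_2 + ...

    The list D at block size s is the sum of Delta_I f over all dyadic blocks I of s samples,
    where Delta_I f = -E_I f + (sum of E_J f over the two children J of I).
    """
    n = len(f)
    if n == 0 or (n & (n - 1)) != 0:
        raise ValueError("length must be a power of two")
    top = averages(f, n)
    diffs = []
    size = n
    while size > 1:
        coarse = averages(f, size)
        fine = averages(f, size // 2)
        diffs.append([b - a for a, b in zip(coarse, fine)])
        size //= 2
    return top, diffs


if __name__ == "__main__":
    f = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
    top, diffs = martingale_differences(f)
    recon = [t + sum(d[i] for d in diffs) for i, t in enumerate(top)]
    print(recon)
```

The weighted version replaces the plain means by $\mu$-weighted means; the telescoping and orthogonality arguments are the same.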
3. Dyadic shifts. A sharp two weight estimate

Definition 3.1. An unweighted dyadic paraproduct is an operator $\Pi$ of the form

$$\Pi f = \sum_{Q \in \mathcal{D}} (\mathbb{E}_Q f)\, h_Q,$$

where $h_Q$ are some (non-weighted) Haar functions.

Definition 3.2. Let $m, n \in \mathbb{N}$. An elementary dyadic shift with parameters $m$, $n$ is an operator given
by

$$\mathbb{S} f := \sum_{Q \in \mathcal{D}} \ \sum_{\substack{Q', Q'' \in \mathcal{D},\ Q', Q'' \subset Q, \\ \ell(Q') = 2^{-m}\ell(Q),\ \ell(Q'') = 2^{-n}\ell(Q)}} |Q|^{-1} \langle f, h^{Q''}_{Q'} \rangle\, h^{Q'}_{Q''},$$

where $h^{Q''}_{Q'}$ and $h^{Q'}_{Q''}$ are (non-weighted) Haar functions for the cubes $Q'$ and $Q''$ respectively, subject
to the normalization

(3.1) $\|h^{Q''}_{Q'}\|_{\infty} \cdot \|h^{Q'}_{Q''}\|_{\infty} \le 1.$

Notice that this implies, in particular, that

(3.2) $\mathbb{S} f(x) = \sum_{Q \in \mathcal{D}} |Q|^{-1} \int_Q a_Q(x,y) f(y)\,dy, \qquad \operatorname{supp} a_Q \subset Q \times Q, \qquad \|a_Q\|_{\infty} \le 1,$

where

(3.3) $a_Q(x,y) = \sum_{\substack{Q', Q'' \in \mathcal{D},\ Q', Q'' \subset Q, \\ \ell(Q') = 2^{-m}\ell(Q),\ \ell(Q'') = 2^{-n}\ell(Q)}} h^{Q'}_{Q''}(x)\, h^{Q''}_{Q'}(y).$

The number $\max(m,n)$ is called the complexity of the dyadic shift.

Definition 3.3. If in the above definition we allow some (or all) $h^{Q''}_{Q'}$, $h^{Q'}_{Q''}$ to be generalized Haar
functions, we get what we will call an elementary generalized dyadic shift.

A dyadic shift with parameters $m$ and $n$ is a sum of at most $(2^d)^2$ elementary dyadic shifts (with
parameters $m$ and $n$). If we allow some (or all) of the elementary dyadic shifts to be generalized ones,
we get the generalized dyadic shift.

Remark. The paraproduct $\Pi$ is an elementary generalized dyadic shift with parameters 0, 1, provided
that $\|h_Q\|_{\infty} \le 1$ for all cubes $Q$.

Remark. The main difference between dyadic shifts and generalized ones is that a dyadic shift is
always a bounded operator in $L^2$ (assuming the normalization (3.1)), while for the boundedness of a
generalized dyadic shift some additional conditions are required.

We always think that our dyadic shifts $\mathbb{S}$ are finite dyadic shifts, meaning that only finitely many
$Q$'s are involved in the definition above. All estimates will be independent of this finite number.

In the present section we consider a two weight $T(1)$ theorem for dyadic shifts. We fix two
measures $\mu$, $\nu$ on $\mathbb{R}^d$. Finite dyadic shifts are integral operators with kernel

$$A(x,y) = \sum_{Q \in \mathcal{D}} |Q|^{-1} a_Q(x,y),$$

the sum being well defined as it is finite. We define now

$$\mathbb{S}_{\mu} f(x) := \int A(x,y) f(y)\,d\mu(y),$$

and its adjoint $\mathbb{S}^*_{\nu}$,

$$\mathbb{S}^*_{\nu} g(y) = \int A(x,y) g(x)\,d\nu(x).$$

We need the notation

$$[\mu,\nu]_{A_2} := \sup_I \langle \mu \rangle_I \langle \nu \rangle_I,$$

where $\langle \sigma \rangle_I := |I|^{-1}\sigma(I)$.

The following theorem is the first new main result of this paper. It is essentially a quantified version
of Theorem 2.3 of [25].

Theorem 3.4. Let $\mathbb{S}$ be an elementary generalized dyadic shift with parameters $m$ and $n$. Let us
suppose that there exists a constant $B$ such that for any $Q \in \mathcal{D}$ we have

(3.4) $\int_Q |\mathbb{S}_{\mu} \mathbf{1}_Q|^2\,d\nu \le B\mu(Q), \qquad \int_Q |\mathbb{S}^*_{\nu} \mathbf{1}_Q|^2\,d\mu \le B\nu(Q).$

Then

(3.5) $\|\mathbb{S}_{\mu} f\|_{\nu} \le C\Bigl(2^{d/2}(r+1)\bigl(B^{1/2} + [\mu,\nu]_{A_2}^{1/2}\bigr) + r^2 [\mu,\nu]_{A_2}^{1/2}\Bigr)\|f\|_{\mu},$

where $r = \max(m,n)$, and $C$ is an absolute constant.

The idea of the proof of this theorem is quite simple. The operator $\mathbb{S}_{\mu}$ is represented essentially as
the sum of weighted paraproducts, which are estimated using condition (3.4), and the operator with
finitely many diagonals, which is estimated by $C[\mu,\nu]_{A_2}^{1/2}$.

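Take two test functions $f$, $g$. Using the martingale difference decomposition (2.1) we can decompose

$$f = \sum_{Q \in \mathcal{D}:\, \ell(Q) = 2^k} \mathbb{E}^{\mu}_Q f + \sum_{Q \in \mathcal{D}:\, \ell(Q) \le 2^k} \Delta^{\mu}_Q f, \qquad g = \sum_{Q \in \mathcal{D}:\, \ell(Q) = 2^k} \mathbb{E}^{\nu}_Q g + \sum_{Q \in \mathcal{D}:\, \ell(Q) \le 2^k} \Delta^{\nu}_Q g.$$

We want to estimate the bilinear form $\langle \mathbb{S}_{\mu} f, g \rangle_{\nu}$. We will first concentrate on the nontrivial case
$f = \sum_{Q \in \mathcal{D}} \Delta^{\mu}_Q f$, $g = \sum_{Q \in \mathcal{D}} \Delta^{\nu}_Q g$; adding the terms $\sum_{Q:\, \ell(Q) = 2^k} \mathbb{E}^{\mu}_Q f$ and $\sum_{Q:\, \ell(Q) = 2^k} \mathbb{E}^{\nu}_Q g$ will be
easy.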

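3.1. Weighted paraproducts. Fix an integer $r$. Then the paraproduct $\Pi^{\mu} = \Pi^{\mu}_{\mathbb{S}}$, acting (formally)
from $L^2(\mu)$ to $L^2(\nu)$, is defined as

$$\Pi^{\mu} f := \sum_{Q \in \mathcal{D}} \mathbb{E}^{\mu}_Q f \sum_{\substack{R \in \mathcal{D},\ R \subset Q, \\ \ell(R) = 2^{-r}\ell(Q)}} \Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_Q.$$

The paraproduct $\Pi^{\nu} = \Pi^{\nu}_{\mathbb{S}^*}$, acting (formally) from $L^2(\nu)$ to $L^2(\mu)$, is defined similarly:

$$\Pi^{\nu} g := \sum_{Q \in \mathcal{D}} \mathbb{E}^{\nu}_Q g \sum_{\substack{R \in \mathcal{D},\ R \subset Q, \\ \ell(R) = 2^{-r}\ell(Q)}} \Delta^{\mu}_R \mathbb{S}^*_{\nu} \mathbf{1}_Q.$$

Notice that if $r \ge n$, then for any $f \in L^2(\mu)$ such that $f\mathbf{1}_Q = \mathbf{1}_Q$, and for any $R \in \mathcal{D}$ such that $R \subset Q$
and $\ell(R) \le 2^{-r}\ell(Q)$, we have

(3.6) $\Delta^{\nu}_R \mathbb{S}_{\mu} f = \Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_Q.$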



(Footnote: in fact, we will only apply this theorem in the situation when martingale difference
decompositions not involving $\mathbb{E}^{\mu}_Q$ and $\mathbb{E}^{\nu}_Q$ are possible.)
Indeed, in the decomposition

$$\langle \mathbb{S}_{\mu}(\mathbf{1}_Q - f), h^{\nu}_R \rangle_{\nu} = \sum_{I \in \mathcal{D}} |I|^{-1} \sum_{\substack{I', I'' \in \mathcal{D},\ I', I'' \subset I, \\ \ell(I') = 2^{-m}\ell(I),\ \ell(I'') = 2^{-n}\ell(I)}} \langle \mathbf{1}_Q - f, h^{I''}_{I'} \rangle_{\mu} \langle h^{I'}_{I''}, h^{\nu}_R \rangle_{\nu}$$

only the terms with $I' \not\subset Q$ and $I'' \subset R$ can give a non-zero contribution. But the inclusions $I'' \subset R \subset Q$
together with the size conditions on $I''$ and $R$ imply that

$$\ell(I) = 2^n \ell(I'') \le 2^r \ell(I'') \le 2^r \ell(R) \le \ell(Q),$$

so $I \subset Q$ (because $I \cap Q \supset I'' \ne \emptyset$, so the inclusion of the dyadic cubes is determined by their sizes).
But the inclusion $I \subset Q$ implies $I' \subset Q$, so the conditions $I' \not\subset Q$ and $I'' \subset R$ are incompatible.

The equality (3.6) means that for $r \ge n$ we can replace $\mathbf{1}_Q$ by $1$, bringing our definition of the
paraproduct more in line with the classical one.

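Lemma 3.5. Let $Q, R \in \mathcal{D}$, and let $r \ge n$. Then for the paraproduct $\Pi^{\mu} = \Pi^{\mu}_{\mathbb{S}}$ defined above:

(i) If $\ell(R) \ge 2^{-r}\ell(Q)$, then $\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = 0$ for all weighted Haar functions $h^{\mu}_Q$ and $h^{\nu}_R$.

(ii) If $R \not\subset Q$, then $\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = 0$ for all weighted Haar functions $h^{\mu}_Q$ and $h^{\nu}_R$.

(iii) If $\ell(R) < 2^{-r}\ell(Q)$, then for all weighted Haar functions $h^{\mu}_Q$ and $h^{\nu}_R$

$$\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = \langle \mathbb{S}_{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu};$$

in particular, if $R \not\subset Q$, then both sides of the equality are 0.

Proof. Let us use $Q'$ and $R'$ for the summation indices in the paraproduct, i.e. let us write

$$\Pi^{\mu} f := \sum_{Q' \in \mathcal{D}} \mathbb{E}^{\mu}_{Q'} f \sum_{\substack{R' \in \mathcal{D},\ R' \subset Q', \\ \ell(R') = 2^{-r}\ell(Q')}} \Delta^{\nu}_{R'} \mathbb{S}_{\mu} \mathbf{1}_{Q'}.$$

Since $h^{\nu}_R$ is orthogonal to the ranges of all projections $\Delta^{\nu}_{R'}$ except $\Delta^{\nu}_R$, we can write

(3.7) $\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = \langle (\mathbb{E}^{\mu}_{Q'} h^{\mu}_Q)\, \Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_{Q'}, h^{\nu}_R \rangle_{\nu} = a \langle \mathbb{S}_{\mu} \mathbf{1}_{Q'}, h^{\nu}_R \rangle_{\nu},$

where $Q'$ is the ancestor of $R$ of order $r$ (i.e. the cube $Q' \supset R$ such that $\ell(Q') = 2^r \ell(R)$) and $a$ is the
value of $\mathbb{E}^{\mu}_{Q'} h^{\mu}_Q$ on $Q'$, i.e. $\mathbb{E}^{\mu}_{Q'} h^{\mu}_Q = a\mathbf{1}_{Q'}$.

It is easy to see that $\mathbb{E}^{\mu}_{Q'} h^{\mu}_Q \ne 0$ (equivalently $a \ne 0$) only if $Q' \subsetneq Q$. Therefore, see (3.7),

$$\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} \ne 0$$

only if $Q' \subsetneq Q$, and statements (i) and (ii) of the lemma follow immediately.

Indeed, if $\ell(R) \ge 2^{-r}\ell(Q)$ and $\ell(Q') = 2^r \ell(R)$, the inclusion $Q' \subsetneq Q$ is impossible, so

$$\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = 0,$$

and statement (i) is proved.

If $R \not\subset Q$, then the inclusion $Q' \subsetneq Q$ (which, as discussed above, is necessary for
$\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} \ne 0$) implies that $R \not\subset Q'$. This means that $Q'$ is not an ancestor of $R$; however, (3.7)
again shows that for $Q'$ to be an ancestor of $R$ is necessary for $\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} \ne 0$.

Let us prove statement (iii). Let $\ell(R) < 2^{-r}\ell(Q)$. If $R \not\subset Q$, then by statement (ii) of the lemma
$\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = 0$. On the other hand, if $M$ is the ancestor of order $r$ of $R$, then $Q \cap M = \emptyset$, thus by
(3.6)

$$\langle \mathbb{S}_{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = \langle \mathbb{S}_{\mu}\, 0 \cdot \mathbf{1}_M, h^{\nu}_R \rangle_{\nu} = 0.$$

So, we only need to consider the case $R \subset Q$.

Let $Q_1$ be the "child" of $Q$ containing $R$ (i.e. $R \subset Q_1 \subset Q$, $\ell(Q_1) = \ell(Q)/2$), and let $b$ be the value
of $h^{\mu}_Q$ on $Q_1$. Then, since $\ell(R) \le 2^{-r}\ell(Q_1)$, (3.6) implies that

$$\langle \mathbb{S}_{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = b \langle \mathbb{S}_{\mu} \mathbf{1}_{Q_1}, h^{\nu}_R \rangle_{\nu}.$$

On the other hand we have shown before, see (3.7), that

$$\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = \langle (\mathbb{E}^{\mu}_{Q'} h^{\mu}_Q)\, \Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_{Q'}, h^{\nu}_R \rangle_{\nu},$$

where $Q' \in \mathcal{D}$ is the ancestor of order $r$ of $R$, meaning that $R \subset Q'$, $\ell(Q') = 2^r \ell(R)$. Therefore
$Q' \subset Q_1$ and so $\mathbb{E}^{\mu}_{Q'} h^{\mu}_Q = b\mathbf{1}_{Q'}$. We also know, see (3.6), that because $Q' \subset Q_1$ we have the equality
$\Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_{Q'} = \Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_{Q_1}$. Thus we can continue:

$$\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = b \langle \Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_{Q'}, h^{\nu}_R \rangle_{\nu} = b \langle \Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_{Q_1}, h^{\nu}_R \rangle_{\nu} = b \langle \mathbb{S}_{\mu} \mathbf{1}_{Q_1}, h^{\nu}_R \rangle_{\nu}.$$

Therefore $\langle \Pi^{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = \langle \mathbb{S}_{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu}$, and the lemma is proved. □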

3.2. Boundedness of the weighted paraproduct. We will need the following well known theorem.
Let $f_R := \mu(R)^{-1}\int_R f\,d\mu$ be the average of the function $f$ with respect to the measure $\mu$.

Theorem 3.6 (Dyadic Carleson Embedding Theorem). If the numbers $a_Q \ge 0$, $Q \in \mathcal{D}$, satisfy the
following Carleson measure condition

(3.8) $\sum_{Q \subset R} a_Q \le \mu(R),$

then for any $f \in L^2(\mu)$

$$\sum_{R \in \mathcal{D}} a_R |f_R|^2 \le 4\,\|f\|^2_{L^2(\mu)}.$$

This theorem is very well known, cf. [?]. Usual proofs are based on a stopping time argument
and the dyadic maximal inequality; the constant 4 appears as $2^2$, where 2 is the norm of the dyadic
maximal operator on $L^2(\mu)$. For an alternative proof using the Bellman function method, see [20].
It was also proved in [28] that the constant 4 is optimal. We should mention that in [19], [28] this
theorem was proved for $\mathbb{R}$, but the same proof works for the general martingale setup. A proof for $\mathbb{R}^2$
was presented in [?], and the same proof works for $\mathbb{R}^d$.
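
The statement can be tested numerically on a finite dyadic tree. The following sketch is an illustration under stated assumptions, not part of the proof: the space is a set of $2^n$ points with arbitrary positive masses, and the Carleson sequence is produced by assigning every point to one randomly chosen dyadic block containing it, so that $a_R$ is the mass assigned to $R$. Since the assigned sets are disjoint, condition (3.8) holds automatically. The helper names are hypothetical.

```python
import random


def dyadic_blocks(n):
    """Yield all dyadic blocks (start, size) of range(n); n must be a power of two."""
    size = n
    while size >= 1:
        for start in range(0, n, size):
            yield start, size
        size //= 2


def random_carleson_sequence(mu, rng):
    """Assign each point to one random dyadic block containing it; a_R is the assigned mass.

    The sets assigned to different blocks are disjoint, so sum_{Q inside R} a_Q <= mu(R).
    """
    n = len(mu)
    a = {block: 0.0 for block in dyadic_blocks(n)}
    sizes = sorted({size for _, size in a})
    for i, mass in enumerate(mu):
        size = rng.choice(sizes)
        a[((i // size) * size, size)] += mass
    return a


def embedding_sum(f, mu, a):
    """Left-hand side sum_R a_R |f_R|^2, with f_R the mu-average of f over the block R."""
    total = 0.0
    for (start, size), a_r in a.items():
        mass = sum(mu[start:start + size])
        if mass > 0:
            weighted = sum(x * w for x, w in zip(f[start:start + size], mu[start:start + size]))
            total += a_r * (weighted / mass) ** 2
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    mu = [rng.uniform(0.1, 2.0) for _ in range(16)]
    f = [rng.uniform(-1.0, 1.0) for _ in range(16)]
    a = random_carleson_sequence(mu, rng)
    lhs = embedding_sum(f, mu, a)
    rhs = 4 * sum(x * x * w for x, w in zip(f, mu))
    print(lhs <= rhs)
```

For sequences of this particular form the left-hand side is dominated by the integral of the square of the dyadic maximal function, which is one way to see where the constant $4 = 2^2$ comes from.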

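Let us now show that the paraproduct $\Pi^{\mu} = \Pi^{\mu}_{\mathbb{S}}$ is bounded. The ranges of the projections $\Delta^{\nu}_R$ are
mutually orthogonal, so to prove the boundedness of the paraproduct $\Pi^{\mu}$ it is sufficient to show that
the numbers

$$a_Q := \sum_{\substack{R \in \mathcal{D},\ R \subset Q, \\ \ell(R) = 2^{-r}\ell(Q)}} \|\Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_Q\|^2_{L^2(\nu)}$$

satisfy the Carleson measure condition (3.8) from Theorem 3.6, up to the factor $B$. Let us prove this.
Consider a cube $\tilde{Q}$. We want to show that

$$\sum_{Q \subset \tilde{Q}} \ \sum_{\substack{R \in \mathcal{D},\ R \subset Q, \\ \ell(R) = 2^{-r}\ell(Q)}} \|\Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_Q\|^2_{L^2(\nu)} \le B\mu(\tilde{Q}).$$

By (3.6) we can replace $\mathbf{1}_Q$ by $\mathbf{1}_{\tilde{Q}}$, so the desired estimate follows from

$$\sum_{\substack{R \in \mathcal{D},\ R \subset \tilde{Q}, \\ \ell(R) \le 2^{-r}\ell(\tilde{Q})}} \|\Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_{\tilde{Q}}\|^2_{L^2(\nu)} \le \sum_{R \subset \tilde{Q}} \|\Delta^{\nu}_R \mathbb{S}_{\mu} \mathbf{1}_{\tilde{Q}}\|^2_{L^2(\nu)} \le \|\mathbf{1}_{\tilde{Q}} \mathbb{S}_{\mu} \mathbf{1}_{\tilde{Q}}\|^2_{L^2(\nu)}.$$

By the assumption of Theorem 3.4, see (3.4),

$$\|\mathbf{1}_{\tilde{Q}} \mathbb{S}_{\mu} \mathbf{1}_{\tilde{Q}}\|^2_{L^2(\nu)} \le B\mu(\tilde{Q}),$$

and so the sequence $B^{-1}a_Q$, $Q \in \mathcal{D}$, satisfies the condition (3.8). Thus the norm of the paraproduct $\Pi^{\mu}$ is
bounded by $CB^{1/2}$ (we can pick $C = 2$ here), and similarly for $\Pi^{\nu}$. □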

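3.3. Boundedness of $\mathbb{S}$: essential part. Let $f \in L^2(\mu)$, $g \in L^2(\nu)$, $\|f\|_{\mu} \le 1$, $\|g\|_{\nu} \le 1$. We want to
estimate $|\langle \mathbb{S}_{\mu} f, g \rangle_{\nu}|$.

Consider first $f$ and $g$ of the form

$$f = \sum_{Q \in \mathcal{D}} \Delta^{\mu}_Q f, \qquad g = \sum_{R \in \mathcal{D}} \Delta^{\nu}_R g, \qquad \|f\|_{\mu} \le 1, \quad \|g\|_{\nu} \le 1.$$

Then by Lemma 3.5

(3.9) $\langle \mathbb{S}_{\mu} f, g \rangle_{\nu} = \langle \Pi^{\mu} f, g \rangle_{\nu} + \langle f, \Pi^{\nu} g \rangle_{\mu} + \sum_{\substack{Q, R \in \mathcal{D}, \\ 2^{-r} \le \ell(R)/\ell(Q) \le 2^r}} \langle \mathbb{S}_{\mu} \Delta^{\mu}_Q f, \Delta^{\nu}_R g \rangle_{\nu}.$

We know that the paraproducts $\Pi^{\mu}$ and $\Pi^{\nu}$ are bounded, so the first two terms can be estimated
together by $4B^{1/2}$. Thus it remains to estimate the last sum.

It is enough to estimate the operator $S$,

$$\langle S f, g \rangle_{\nu} := \sum_{\substack{Q, R \in \mathcal{D}, \\ 2^{-r}\ell(Q) \le \ell(R) \le \ell(Q)}} \langle \mathbb{S}_{\mu} \Delta^{\mu}_Q f, \Delta^{\nu}_R g \rangle_{\nu},$$

because the sum over $2^{-r}\ell(R) \le \ell(Q) < \ell(R)$ is estimated similarly. The operator $S$ can be split as
$S = \sum_{k=0}^{r} S_k$, where

$$\langle S_k f, g \rangle_{\nu} := \sum_{\substack{Q, R \in \mathcal{D}, \\ \ell(R) = 2^{-k}\ell(Q)}} \langle \mathbb{S}_{\mu} \Delta^{\mu}_Q f, \Delta^{\nu}_R g \rangle_{\nu}.$$

Each $S_k$ can in turn be decomposed as $S_k = \sum_{j \in \mathbb{Z}} S_{k,j}$, where

$$\langle S_{k,j} f, g \rangle_{\nu} := \sum_{\substack{Q, R \in \mathcal{D}, \\ \ell(Q) = 2^j,\ \ell(R) = 2^{j-k}}} \langle \mathbb{S}_{\mu} \Delta^{\mu}_Q f, \Delta^{\nu}_R g \rangle_{\nu}.$$

For a fixed $k$ the ranges $\operatorname{Ran} S_{k,j}$, $j \in \mathbb{Z}$, are mutually orthogonal in $L^2(\nu)$, and the dual ranges $\operatorname{Ran} S^*_{k,j}$,
$j \in \mathbb{Z}$, are mutually orthogonal in $L^2(\mu)$. Therefore $\|S_k\| \le \max_{j \in \mathbb{Z}} \|S_{k,j}\|$, so we only need to uniformly
estimate the individual operators $S_{k,j}$.

So, if

$$f_j = \sum_{Q \in \mathcal{D}:\, \ell(Q) = 2^j} \Delta^{\mu}_Q f, \qquad g_{j-k} = \sum_{R \in \mathcal{D}:\, \ell(R) = 2^{j-k}} \Delta^{\nu}_R g,$$

it is sufficient to estimate $\langle S_{k,j} f_j, g_{j-k} \rangle_{\nu} = \langle \mathbb{S}_{\mu} f_j, g_{j-k} \rangle_{\nu}$.

We can decompose the operator $S_{k,j}$ into interior and outer parts:

$$\langle S_{k,j} f, g \rangle_{\nu} = \sum_{\substack{Q, R \in \mathcal{D}:\, R \subset Q, \\ \ell(Q) = 2^j,\ \ell(R) = 2^{j-k}}} \langle \mathbb{S}_{\mu} \Delta^{\mu}_Q f, \Delta^{\nu}_R g \rangle_{\nu} + \sum_{\substack{Q, R \in \mathcal{D}:\, R \cap Q = \emptyset, \\ \ell(Q) = 2^j,\ \ell(R) = 2^{j-k}}} \langle \mathbb{S}_{\mu} \Delta^{\mu}_Q f, \Delta^{\nu}_R g \rangle_{\nu}$$

$$=: \langle S^{\mathrm{int}}_{k,j} f, g \rangle_{\nu} + \langle S^{\mathrm{out}}_{k,j} f, g \rangle_{\nu}.$$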



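Let us estimate $S^{\mathrm{out}}_{k,j}$. For cubes $Q, R \in \mathcal{D}$, $R \cap Q = \emptyset$, $\ell(Q) = 2^j$, $\ell(R) = 2^{j-k}$, and the corresponding
weighted Haar functions $h^{\mu}_Q$ and $h^{\nu}_R$ we can write

(3.10) $\langle S^{\mathrm{out}}_{k,j} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = \langle \mathbb{S}_{\mu} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu} = \sum_{M \in \mathcal{D}} |M|^{-1} \int_{M \times M} a_M(x,y)\, h^{\mu}_Q(y)\, h^{\nu}_R(x)\,d\mu(y)\,d\nu(x),$

where the kernels $a_M$ are from (3.2).

If $\ell(M) \le \ell(Q) = 2^j$, then the cube $M$ cannot contain both $Q$ and $R$ (because $R \cap Q = \emptyset$), so the
corresponding integral in (3.10) is 0. On the other hand, if $\ell(M) > 2^r \ell(Q)$, $r = \max(m,n)$ being
the complexity of the dyadic shift $\mathbb{S}$, then for any $x$ the function $a_M(x, \cdot)$ is constant on $Q$, so the
corresponding integral in (3.10) is again 0.

So in (3.10) we only need to count $M$ with $2^j < \ell(M) \le 2^{j+r}$, and therefore we can write

$$|\langle S^{\mathrm{out}}_{k,j} h^{\mu}_Q, h^{\nu}_R \rangle_{\nu}| = \Bigl| \sum_{s=j+1}^{j+r} \int A_s(x,y)\, h^{\mu}_Q(y)\, h^{\nu}_R(x)\,d\mu(y)\,d\nu(x) \Bigr| \le \sum_{s=j+1}^{j+r} \int |A_s(x,y)| \cdot |h^{\mu}_Q(y)| \cdot |h^{\nu}_R(x)|\,d\mu(y)\,d\nu(x),$$

where $A_s(x,y) := \sum_{M \in \mathcal{D}:\, \ell(M) = 2^s} |M|^{-1} a_M(x,y)$.

Adding extra non-negative terms (with $R \subset Q$) we can estimate

$$|\langle S^{\mathrm{out}}_{k,j} f, g \rangle_{\nu}| \le \sum_{s=j+1}^{j+r} \ \sum_{\substack{Q, R \in \mathcal{D}, \\ \ell(Q) = 2^j,\ \ell(R) = 2^{j-k}}} \int |A_s(x,y)| \cdot |\Delta^{\mu}_Q f(y)| \cdot |\Delta^{\nu}_R g(x)|\,d\mu(y)\,d\nu(x)$$

$$= \sum_{s=j+1}^{j+r} \int |A_s(x,y)| \cdot |f_j(y)| \cdot |g_{j-k}(x)|\,d\mu(y)\,d\nu(x).$$

But each integral operator with kernel $|A_s|$ is the direct sum of the operators with kernels $|M|^{-1}|a_M|$,
$M \in \mathcal{D}$, $\ell(M) = 2^s$ (recall that $a_M$ is supported on $M \times M$).
Since $\|a_M\|_{\infty} \le 1$ we can estimate the Hilbert-Schmidt norm

$$\int_{M \times M} |M|^{-2} |a_M(x,y)|^2\,d\mu(y)\,d\nu(x) \le [\mu,\nu]_{A_2},$$

so the norm of each operator with kernel $|M|^{-1}|a_M(x,y)|$ is at most $[\mu,\nu]_{A_2}^{1/2}$. Therefore the norm of
each operator with kernel $|A_s(x,y)|$ is estimated by $[\mu,\nu]_{A_2}^{1/2}$, and summing in $s$ we get

(3.11) $\|S^{\mathrm{out}}_{k,j}\|_{L^2(\mu) \to L^2(\nu)} \le r\,[\mu,\nu]_{A_2}^{1/2}.$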
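
The Hilbert-Schmidt step can be checked in a discrete model. In the sketch below (an illustration under stated assumptions, with hypothetical helper names) the cube $M$ is replaced by finitely many points carrying masses `mu[j]` and `nu[i]`, `vol` plays the role of $|M|$, and `a` is any matrix with entries in $[-1,1]$. By the Cauchy-Schwarz inequality the operator with kernel $|M|^{-1}a_M$ then has norm at most $(\mu(M)\nu(M))^{1/2}/|M|$ from $L^2(\mu)$ to $L^2(\nu)$, which is at most $[\mu,\nu]_{A_2}^{1/2}$.

```python
import math
import random


def apply_block(a, f, mu, vol):
    """(T f)(x_i) = vol**-1 * sum_j a[i][j] * f[j] * mu[j]: one block |M|^{-1} a_M of the kernel."""
    return [sum(a_ij * f_j * m_j for a_ij, f_j, m_j in zip(row, f, mu)) / vol for row in a]


def weighted_norm(g, weights):
    """L^2 norm of g with respect to the point masses `weights`."""
    return math.sqrt(sum(x * x * w for x, w in zip(g, weights)))


def hs_bound(mu, nu, vol):
    """sqrt(mu(M) * nu(M)) / |M|, an upper bound for the Hilbert-Schmidt norm of the block."""
    return math.sqrt(sum(mu) * sum(nu)) / vol


if __name__ == "__main__":
    rng = random.Random(0)
    n, vol = 8, 1.0
    mu = [rng.uniform(0.1, 3.0) for _ in range(n)]
    nu = [rng.uniform(0.1, 3.0) for _ in range(n)]
    a = [[rng.uniform(-1.0, 1.0) for _ in range(n)] for _ in range(n)]
    f = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    lhs = weighted_norm(apply_block(a, f, mu, vol), nu)
    print(lhs <= hs_bound(mu, nu, vol) * weighted_norm(f, mu))
```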

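To estimate the norm of $S^{\mathrm{int}}_{k,j}$, we need the following simple lemma.

Lemma 3.7. Under the assumptions of Theorem 3.4,

$$\|\mathbf{1}_Q \mathbb{S}_{\mu} h^{\mu}_Q\|^2_{\nu} \le 2^d \bigl(B + 4[\mu,\nu]_{A_2}\bigr) \|h^{\mu}_Q\|^2_{\mu}$$

for any $\mu$-Haar function $h^{\mu}_Q$.

Proof. Let $Q_k$, $k = 1, 2, \dots, 2^d$, be the dyadic children of $Q$. A $\mu$-Haar function $h^{\mu}_Q$ can be represented
as

(3.12) $h^{\mu}_Q = \sum_{k=1}^{2^d} \alpha_k \mathbf{1}_{Q_k}, \qquad \sum_{k=1}^{2^d} \alpha_k \mu(Q_k) = 0,$

and

(3.13) $\|h^{\mu}_Q\|^2_{\mu} = \sum_{k=1}^{2^d} |\alpha_k|^2 \mu(Q_k).$

By assumption (3.4) of Theorem 3.4,

(3.14) $\|\mathbf{1}_{Q_k} \mathbb{S}_{\mu} \mathbf{1}_{Q_k}\|^2_{\nu} \le B\mu(Q_k).$

Let us estimate $\|\mathbf{1}_{Q \setminus Q_k} \mathbb{S}_{\mu} \mathbf{1}_{Q_k}\|_{\nu}$. We know that

$$\mathbb{S}_{\mu} \mathbf{1}_{Q_k}(x) = \sum_{M \in \mathcal{D}} |M|^{-1} \int_{Q_k} a_M(x,y) \mathbf{1}_{Q_k}(y)\,d\mu(y).$$

Since the functions $a_M$ are supported on $M \times M$, only the terms with $M \supset Q$ can give a non-zero
contribution for $x \notin Q_k$. Therefore, summing the geometric series we get that

$$|\mathbb{S}_{\mu} \mathbf{1}_{Q_k}(x)| \le 2\mu(Q_k)|Q|^{-1} \qquad \forall x \notin Q_k.$$

Then

$$\|\mathbf{1}_{Q \setminus Q_k} \mathbb{S}_{\mu} \mathbf{1}_{Q_k}\|^2_{\nu} \le 4\mu(Q_k)^2 |Q|^{-2} \nu(Q),$$

and combining this estimate with (3.14) we get

$$\|\mathbf{1}_Q \mathbb{S}_{\mu} \mathbf{1}_{Q_k}\|^2_{\nu} \le B\mu(Q_k) + 4\mu(Q_k)^2 |Q|^{-2} \nu(Q) \le B\mu(Q_k) + 4\mu(Q_k)\mu(Q)|Q|^{-2}\nu(Q) \le \bigl(B + 4[\mu,\nu]_{A_2}\bigr)\mu(Q_k).$$

Therefore, recalling (3.12) and (3.13), we get

$$\|\mathbf{1}_Q \mathbb{S}_{\mu} h^{\mu}_Q\|_{\nu} \le \sum_{k=1}^{2^d} |\alpha_k|\, \|\mathbf{1}_Q \mathbb{S}_{\mu} \mathbf{1}_{Q_k}\|_{\nu} \le \bigl(B + 4[\mu,\nu]_{A_2}\bigr)^{1/2} \sum_{k=1}^{2^d} |\alpha_k| \mu(Q_k)^{1/2}$$

$$\le 2^{d/2} \bigl(B + 4[\mu,\nu]_{A_2}\bigr)^{1/2} \Bigl(\sum_{k=1}^{2^d} |\alpha_k|^2 \mu(Q_k)\Bigr)^{1/2} = 2^{d/2} \bigl(B + 4[\mu,\nu]_{A_2}\bigr)^{1/2} \|h^{\mu}_Q\|_{\mu}.$$

□

Using Lemma 3.7 we can easily estimate $S^{\mathrm{int}}_{k,j}$. Namely,

$$\|S^{\mathrm{int}}_{k,j} f\|^2_{\nu} = \sum_{Q \in \mathcal{D}:\, \ell(Q) = 2^j} \Bigl\| \sum_{R \subset Q:\, \ell(R) = 2^{j-k}} \Delta^{\nu}_R \mathbb{S}_{\mu} \Delta^{\mu}_Q f \Bigr\|^2_{\nu} \le \sum_{Q \in \mathcal{D}:\, \ell(Q) = 2^j} \|\mathbf{1}_Q \mathbb{S}_{\mu} \Delta^{\mu}_Q f\|^2_{\nu}$$

$$\le 2^d \bigl(B + 4[\mu,\nu]_{A_2}\bigr) \sum_{Q \in \mathcal{D}:\, \ell(Q) = 2^j} \|\Delta^{\mu}_Q f\|^2_{\mu} = 2^d \bigl(B + 4[\mu,\nu]_{A_2}\bigr) \|f_j\|^2_{\mu}.$$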
Combining this with the estimate (3.11) of $S^{\mathrm{out}}_{k,j}$ we get that

$$\|S_{k,j}\|_{L^2(\mu) \to L^2(\nu)} \le 2^{d/2} \bigl(B + 4[\mu,\nu]_{A_2}\bigr)^{1/2} + r\,[\mu,\nu]_{A_2}^{1/2}.$$

Since the operator $S_k$ is the orthogonal sum of the $S_{k,j}$, we get the same estimate for $\|S_k\|$. To get the
estimate for $\|S\|$, $S = \sum_{k=0}^{r} S_k$, we just multiply the above estimate by $r + 1$.
Adding in (3.9) all the estimates together we get that for $f$ and $g$ of the form

$$f = \sum_{Q \in \mathcal{D}} \Delta^{\mu}_Q f, \qquad g = \sum_{R \in \mathcal{D}} \Delta^{\nu}_R g, \qquad \|f\|_{\mu} \le 1, \quad \|g\|_{\nu} \le 1,$$

we have

(3.15) $|\langle \mathbb{S}_{\mu} f, g \rangle_{\nu}| \le 4B^{1/2} + 2(r+1)\Bigl[2^{d/2} \bigl(B + 4[\mu,\nu]_{A_2}\bigr)^{1/2} + r\,[\mu,\nu]_{A_2}^{1/2}\Bigr];$

the first term here comes from the paraproducts, and the extra factor 2 in the second term is to take into
account the sum over $\ell(Q) < \ell(R)$ in (3.9).

3.4. Boundedness of $\mathbb{S}$: some little details. We are almost done with the proof of Theorem 3.4,
modulo a little detail: for arbitrary measures $\mu$, functions $f \in L^2(\mu)$ do not admit the martingale difference
decomposition $f = \sum_{Q \in \mathcal{D}} \Delta^{\mu}_Q f$.

Each compact subset of $\mathbb{R}^d$ is contained in at most $2^d$ cubes of the same size as the size of this
compact subset, so let $Q_k$, $k = 1, 2, \dots, 2^d$, be the dyadic cubes of some size $2^N$ containing the supports of
$f$ and $g$. The correct decomposition is given by (2.1), which reads as

(3.16) $f = \sum_{Q \in \mathcal{D}:\, \ell(Q) = 2^k} \mathbb{E}^{\mu}_Q f + \sum_{Q \in \mathcal{D}:\, \ell(Q) \le 2^k} \Delta^{\mu}_Q f$

(here $k$ is an arbitrary but fixed integer), and similarly for $g \in L^2(\nu)$,

(3.17) $g = \sum_{Q \in \mathcal{D}:\, \ell(Q) = 2^k} \mathbb{E}^{\nu}_Q g + \sum_{Q \in \mathcal{D}:\, \ell(Q) \le 2^k} \Delta^{\nu}_Q g,$

so we need to estimate some extra terms. Of course, in the situation when we apply the theorem
($d\mu = w\,dx$, $d\nu = w^{-1}dx$, $w$ satisfies the $A_2$ condition) $f$ and $g$ can be represented via the martingale
difference decomposition, although some explanation will still be needed.

Fortunately, there is a very simple way to estimate the extra terms. Let us say that dyadic cubes
$Q, R \in \mathcal{D}$ are relatives if they have a common ancestor, i.e. a cube $M \in \mathcal{D}$ such that $Q, R \subset M$. The
importance of the notion of relatives stems from the trivial observation that if the cubes $Q$ and $R$ are
not relatives, then $\mathbb{S}_{\mu} \mathbf{1}_Q \equiv 0$ on $R$.

It is sufficient to prove the estimate on a dense set of compactly supported functions. For compactly
supported functions $f$ and $g$ only finitely many terms $\mathbb{E}^{\mu}_Q f$ and $\mathbb{E}^{\nu}_Q g$ in the decompositions (3.16) and
(3.17) are non-zero. Let us split the collection of the corresponding cubes into equivalence classes of
relatives, and for each equivalence class find a common ancestor (it is always possible because of
finiteness).

Denote by $\mathcal{A}$ the set of these common ancestors. Then we can write instead of (3.16) and (3.17)

(3.18) $f = \sum_{Q \in \mathcal{A}} \mathbb{E}^{\mu}_Q f + \sum_{Q \in \mathcal{A}} \sum_{R \in \mathcal{D}:\, R \subset Q} \Delta^{\mu}_R f =: f_e + f_d,$

(3.19) $g = \sum_{Q \in \mathcal{A}} \mathbb{E}^{\nu}_Q g + \sum_{Q \in \mathcal{A}} \sum_{R \in \mathcal{D}:\, R \subset Q} \Delta^{\nu}_R g =: g_e + g_d;$
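the indices "e" and "d" here mean expectation and difference. Let us decompose

$$\langle \mathbb{S}_{\mu} f, g \rangle_{\nu} = \langle \mathbb{S}_{\mu}(f_e + f_d), g_e + g_d \rangle_{\nu} = \langle \mathbb{S}_{\mu} f_e, g \rangle_{\nu} + \langle \mathbb{S}_{\mu} f_d, g_e \rangle_{\nu} + \langle \mathbb{S}_{\mu} f_d, g_d \rangle_{\nu}.$$

The last term is estimated by (3.15) (note that $\|f\|^2_{\mu} = \|f_e\|^2_{\mu} + \|f_d\|^2_{\mu}$ and similarly for $\|g\|_{\nu}$), so we
just need to estimate the first two terms.

Any two cubes $Q, Q' \in \mathcal{A}$, $Q \ne Q'$, are not relatives, so as we already mentioned $\mathbb{S}_{\mu} \mathbf{1}_Q \equiv 0$ on any
$Q' \in \mathcal{A}$, $Q' \ne Q$. Therefore

$$|\langle \mathbb{S}_{\mu} \mathbb{E}^{\mu}_Q f, g \rangle_{\nu}| = |\langle \mathbb{S}_{\mu} \mathbb{E}^{\mu}_Q f, g\mathbf{1}_Q \rangle_{\nu}| \le \|\mathbf{1}_Q \mathbb{S}_{\mu} \mathbb{E}^{\mu}_Q f\|_{\nu} \|g\mathbf{1}_Q\|_{\nu} \le B^{1/2} \|\mathbb{E}^{\mu}_Q f\|_{\mu} \|g\mathbf{1}_Q\|_{\nu}$$

(we use assumption (3.4) of Theorem 3.4 for the last inequality). Summing over all $Q \in \mathcal{A}$ and
applying the Cauchy-Schwarz inequality we get

$$|\langle \mathbb{S}_{\mu} f_e, g \rangle_{\nu}| \le \sum_{Q \in \mathcal{A}} |\langle \mathbb{S}_{\mu} \mathbb{E}^{\mu}_Q f, g \rangle_{\nu}| \le B^{1/2} \sum_{Q \in \mathcal{A}} \|\mathbb{E}^{\mu}_Q f\|_{\mu} \|g\mathbf{1}_Q\|_{\nu} \le B^{1/2} \|f_e\|_{\mu} \|g\|_{\nu} \le B^{1/2} \|f\|_{\mu} \|g\|_{\nu}.$$

Similarly

$$|\langle \mathbb{S}_{\mu} f_d, g_e \rangle_{\nu}| = |\langle f_d, \mathbb{S}^*_{\nu} g_e \rangle_{\mu}| \le B^{1/2} \|f_d\|_{\mu} \|g_e\|_{\nu} \le B^{1/2} \|f\|_{\mu} \|g\|_{\nu},$$

so in the general case we just need to add $2B^{1/2}$ to the right side of (3.15).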

4. Dyadic shifts and random lattices 

In this section we use a probabilistic approach to decompose an arbitrary Calderón-Zygmund
operator as an average of simple blocks, namely, the dyadic shifts investigated above. More precisely,
we prove the following result, which is a variant of [12, Theorem 4.2]. The decomposition here is
easier than in [12], and there is a reason for that: the shifts in [12] needed to have an extra geometric
property pertinent to being applied in conjunction with [35]. Here we do not need that, as we are not
basing our reasoning on the weighted $T1$ theorem of [35]. The idea of such a decomposition goes back
to the methods of non-homogeneous Harmonic Analysis exploited in [24] or [17], for example.

Theorem 4.1. Let $T$ be a Calderón-Zygmund operator in $\mathbb{R}^d$ with parameter $\alpha$. Then $T$ can be
represented as

$$T = C \int_{\Omega} \sum_{m,n} 2^{-(m+n)\alpha/2}\, \mathbb{S}^{\omega}_{m,n}\,d\mathbb{P}(\omega),$$

where $\mathbb{S}^{\omega}_{m,n}$ is a dyadic shift with parameters $m$, $n$ in the lattice $\mathcal{D}^{\omega}$; the shifts with parameters 0, 1 and
1, 0 can be generalized shifts, and all other shifts are the regular ones.

The constant $C$ depends only on the dimension $d$ and the parameters of the Calderón-Zygmund
operator $T$ (the norm $\|T\|_{L^2 \to L^2}$, the smoothness $\alpha$, and the constant $C_{cz}$ in the Calderón-Zygmund
estimates).

4.1. Getting rid of bad cubes. Let $\mathcal D_\omega$, $\omega \in \Omega$, be the translated dyadic lattice in $\mathbb R^d$ as defined in Section 2.1 and let $\mathbb P$ be the canonical probability measure on $\Omega$ (also defined in Section 2.1). Fix $r_0 \in \mathbb N$. Let $\gamma = \frac{\alpha}{2(d+\alpha)}$, where $\alpha$ is the Calderón-Zygmund parameter of the operator $T$.

Definition. A cube $Q \in \mathcal D_\omega$ is called bad if there exists a bigger cube $R \in \mathcal D_\omega$ such that $\ell(Q) < 2^{-r_0}\ell(R)$ and

\[
\operatorname{dist}(Q, \partial R) < \ell(Q)^{\gamma}\,\ell(R)^{1-\gamma}.
\]
Let us introduce some probabilistic notation we will use in this section. Let $\mathbb E = \mathbb E_\Omega$ denote the expectation with respect to the probability measure $\mathbb P$,

\[
\mathbb E_\Omega F = \mathbb E_\Omega F(\omega) = \int_\Omega F(\omega)\, d\mathbb P(\omega);
\]

slightly abusing the notation we will often write $\mathbb E_\Omega F(\omega)$ to emphasize that $F$ is a random variable (depends on $\omega$).

For $k \in \mathbb Z$ let $\mathfrak A_k$ be the sigma-algebra generated by the random variables $\omega_j$, $j < k$, and let $\mathbb E_{\mathfrak A_k}$ be the corresponding conditional expectation. Because of the product structure of $\Omega$, the conditional expectation $\mathbb E_{\mathfrak A_k}$ is easier to understand: it is just the integration with respect to a part of the variables $\omega_j$.

Namely, for $k \in \mathbb Z$ one can split $\omega = ({}_k\omega, \omega^k)$, where ${}_k\omega := (\omega_j)_{j<k}$, $\omega^k := (\omega_j)_{j\ge k}$, so $\Omega$ is represented as a product $\Omega = {}_k\Omega \times \Omega^k$. Note that the sets ${}_k\Omega$ and $\Omega^k$ are probability spaces with respect to the standard product measures. We will use the same letter $\mathbb P$ for these measures (probabilities), hoping that this will not lead to confusion.

Denote by $\Omega^k[{}_k\omega]$ the "slice" of $\Omega$,

\[
\Omega^k[{}_k\omega] = \{({}_k\omega, \omega^k) : \omega^k \in \Omega^k\}.
\]

Then for almost all ${}_k\omega$, assuming that $\omega = ({}_k\omega, \omega^k)$, we have

\[
(\mathbb E_{\mathfrak A_k} F)(\omega) = \mathbb E_{\Omega^k[{}_k\omega]} F := \int_{\Omega^k} F({}_k\omega, \omega^k)\, d\mathbb P(\omega^k),
\]

so the conditional expectation $\mathbb E_{\mathfrak A_k}$ is just the integration over slices.

Finally, given a cube $Q \in \mathcal D_\omega$, $\ell(Q) = 2^k$, denote by $\Omega[Q]$ the slice $\Omega[Q] := \Omega^k[{}_k\omega]$ for the particular choice of the parameters ${}_k\omega = (\omega_j)_{j<k}$ determining the position of $Q$ (and of all cubes of size $2^k$). The notation $\mathbb E_{\Omega[Q]}$ then should be clear, and one also can define the conditional probability

\[
\mathbb P\{\text{event} \mid Q\} := \mathbb E_{\Omega[Q]} \mathbf 1_{\text{event}}.
\]

Lemma 4.2. $\pi_{\mathrm{bad}} = \pi_{\mathrm{bad}}(r_0, \gamma, d) := \mathbb P\{Q \text{ is bad} \mid Q\} \le C(d)\, 2^{-c r_0}$.

In words: given a cube $Q$, the probability that it is bad is a constant depending only on $r_0$, $\gamma$ and $d$, and can be estimated as stated.

Proof. The proof is an easy exercise for the reader. □

From now on let us fix a sufficiently large $r_0$ such that $\pi_{\mathrm{bad}} < 1$, so the probability of being good satisfies $\pi_{\mathrm{good}} = 1 - \pi_{\mathrm{bad}} > 0$.

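Lemma 4.3. Let $T$ be a bounded operator in $L^2 = L^2(\mathbb R^d, dx)$. Then for all $f, g \in C_0^\infty$

\[
\langle Tf, g\rangle = \pi_{\mathrm{good}}^{-1} \int_\Omega \sum_{\substack{I,J\in\mathcal D_\omega \\ \ell(I)\le\ell(J) \\ I \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle\, d\mathbb P(\omega)
+ \pi_{\mathrm{good}}^{-1} \int_\Omega \sum_{\substack{I,J\in\mathcal D_\omega \\ \ell(I)>\ell(J) \\ J \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle\, d\mathbb P(\omega).
\]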

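Proof. It is more convenient to use probabilistic notation in the proof. Let

\[
f_{\mathrm{good},\omega} := \sum_{\substack{I\in\mathcal D_\omega \\ I \text{ is good}}} \Delta_I f.
\]

Then for any $f, g \in L^2$,

\begin{align*}
\mathbb E_\Omega \langle f_{\mathrm{good},\omega}, g\rangle
&= \mathbb E_\Omega \sum_{\substack{I\in\mathcal D_\omega \\ I \text{ is good}}} \langle \Delta_I f, \Delta_I g\rangle \\
&= \sum_{k\in\mathbb Z} \mathbb E_\Omega \mathbb E_{\mathfrak A_k} \sum_{\substack{I\in\mathcal D_\omega:\,\ell(I)=2^k \\ I \text{ is good}}} \langle \Delta_I f, \Delta_I g\rangle \\
&= \sum_{k\in\mathbb Z} \mathbb E_\Omega \mathbb E_{\mathfrak A_k} \sum_{I\in\mathcal D_\omega:\,\ell(I)=2^k} \langle \Delta_I f, \Delta_I g\rangle\, \mathbf 1_{\{I \text{ is good}\}}(\omega).
\end{align*}

To compute the conditional expectation let us notice that the position of the cubes $I \in \mathcal D_\omega$, $\ell(I) = 2^k$, depends only on the random variables $\omega_j$, $j < k$. On the other hand, the event that a cube $I \in \mathcal D_\omega$, $\ell(I) = 2^k$, is good depends only on the variables $\omega_j$, $j \ge k$, and for fixed variables $\omega_j$, $j < k$, the corresponding conditional probability of this event is $\pi_{\mathrm{good}}$, so we can write for the conditional expectation

\[
\mathbb E_{\mathfrak A_k} \mathbf 1_{\{I \text{ is good}\}}(\omega) = \pi_{\mathrm{good}}. \tag{4.1}
\]

Therefore

\[
\mathbb E_{\mathfrak A_k} \sum_{I\in\mathcal D_\omega:\,\ell(I)=2^k} \langle \Delta_I f, \Delta_I g\rangle\, \mathbf 1_{\{I \text{ is good}\}}(\omega) = \pi_{\mathrm{good}} \sum_{I\in\mathcal D_\omega:\,\ell(I)=2^k} \langle \Delta_I f, \Delta_I g\rangle,
\]

which gives us

\[
\mathbb E_\Omega \langle f_{\mathrm{good},\omega}, g\rangle = \pi_{\mathrm{good}} \langle f, g\rangle. \tag{4.2}
\]

Applying this identity to $\langle T f_{\mathrm{good},\omega}, g\rangle = \langle f_{\mathrm{good},\omega}, T^*g\rangle$ (with $T^*g$ instead of $g$) we get

\begin{align*}
\pi_{\mathrm{good}} \langle Tf, g\rangle &= \mathbb E_\Omega \langle T f_{\mathrm{good},\omega}, g\rangle \\
&= \mathbb E_\Omega \sum_{\substack{\ell(I)\le\ell(J) \\ I \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle
+ \sum_{k\in\mathbb Z} \mathbb E_\Omega \mathbb E_{\mathfrak A_k} \sum_{\ell(I)=2^k,\ \ell(I)>\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle\, \mathbf 1_{\{I \text{ is good}\}} \\
&= \mathbb E_\Omega \sum_{\substack{\ell(I)\le\ell(J) \\ I \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle
+ \pi_{\mathrm{good}}\, \mathbb E_\Omega \sum_{\ell(I)>\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle; \tag{4.3}
\end{align*}

here again in the last equality we used (4.1) and the fact that for $2^k = \ell(I) > \ell(J)$ the position of $I$ and $J$ depends on the variables $\omega_j$, $j < k$, while the property of $I$ being good depends on the variables $\omega_j$, $j \ge k$, and is not influenced by the position of $J$.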
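
The conditional-expectation step used for (4.2) and (4.3) can be spelled out as a short computation. This is only a sketch in the notation of this section; the symbol $X_I$ is introduced here for illustration and stands for any quantity determined by the position of the cubes of size $2^k$ (for example $\langle \Delta_I f, \Delta_I g\rangle$).

```latex
% Sketch. Assumption: X_I depends only on \omega_j, j < k, where \ell(I) = 2^k,
% so X_I is measurable with respect to \mathfrak{A}_k and can be pulled out of
% the conditional expectation; (4.1) then supplies the factor \pi_{good}.
\begin{align*}
\mathbb{E}_\Omega \sum_{\ell(I)=2^k} X_I\,\mathbf{1}_{\{I \text{ is good}\}}
  &= \mathbb{E}_\Omega\,\mathbb{E}_{\mathfrak{A}_k} \sum_{\ell(I)=2^k} X_I\,\mathbf{1}_{\{I \text{ is good}\}} \\
  &= \mathbb{E}_\Omega \sum_{\ell(I)=2^k} X_I\,\mathbb{E}_{\mathfrak{A}_k}\mathbf{1}_{\{I \text{ is good}\}} \\
  &= \pi_{\mathrm{good}}\,\mathbb{E}_\Omega \sum_{\ell(I)=2^k} X_I .
\end{align*}
```

The same computation fails when $X_I$ also involves a larger cube $J$, since the position of $J$ is then not determined by the variables $\omega_j$, $j<k$; this is why the two halves of the sum are treated differently.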

Remark 4.4. To justify the interchange of the summation and expectation $\mathbb E_\Omega$ in (4.3) we first observe
that for smooth $f$

C(d)\\Vf\U(I) 1(1) <h 

£(i)>i. 

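So, if we denote

\[
f^k_\omega := \sum_{I\in\mathcal D_\omega:\,\ell(I)=2^k} \Delta_I f, \qquad
f^k_{\mathrm{good},\omega} := \sum_{\substack{I\in\mathcal D_\omega:\,\ell(I)=2^k \\ I \text{ is good}}} \Delta_I f,
\]

then, integrating the previous estimates, we have for $f \in C_0^\infty$

\[
\|f^k_\omega\|_{L^2},\ \|f^k_{\mathrm{good},\omega}\|_{L^2} \le C(f)\min\{2^k, 2^{-kd}\},
\]

so

\[
\sum_{k\in\mathbb Z} \|f^k_\omega\|_{L^2} \le C(f), \qquad \sum_{k\in\mathbb Z} \|f^k_{\mathrm{good},\omega}\|_{L^2} \le C(f).
\]

Then for $f, g \in C_0^\infty$

\[
\sum_{k,j\in\mathbb Z} |\langle T f^k_{\mathrm{good},\omega}, g^j_\omega\rangle| \le \|T\|\, C(f)\, C(g),
\]

which justifies the first interchange of summation and integration in (4.3). The same estimate holds if we replace $f^k_{\mathrm{good},\omega}$ by $f^k_\omega$, and this justifies the second interchange.

Note also that the sum $f^k_\omega$ has at most $C(f,k)$ non-zero terms $\Delta_I f$ (where $C(f,k) < \infty$ does not depend on $\omega$), so for fixed $k$ and $j$ we can interchange summation over $I$, $\ell(I) = 2^k$, and integration without any problems.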

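Let us continue with the proof of Lemma 4.3. Since for all $\omega \in \Omega$

\[
\langle Tf, g\rangle = \sum_{I,J\in\mathcal D_\omega} \langle T\Delta_I f, \Delta_J g\rangle,
\]

averaging over all $\omega$ we get

\[
\langle Tf, g\rangle = \mathbb E_\Omega \sum_{\ell(I)\le\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle + \mathbb E_\Omega \sum_{\ell(I)>\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle. \tag{4.4}
\]

Multiplying this identity by $\pi_{\mathrm{good}}$ and comparing with (4.3) we get that

\[
\pi_{\mathrm{good}}\, \mathbb E_\Omega \sum_{\ell(I)\le\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle = \mathbb E_\Omega \sum_{\substack{\ell(I)\le\ell(J) \\ I \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle. \tag{4.5}
\]

Remark. Note that the above identity cannot be obtained by directly applying the above trick with the conditional expectation to the right side. If $2^s = \ell(I) < \ell(J) = 2^k$, then the position of $I$ and $J$ is defined by the variables $\omega_j$, $j < k$, and the property of $I$ being good depends on $\omega_j$, $j \ge s$. Thus the conditional probability of $I$ being good depends on the mutual position of $I$ and $J$, and so there is no splitting of the kind we used in proving (4.2), (4.3).

We can repeat the reasoning leading to (4.5) without any changes for the splitting into $\ell(I) < \ell(J)$ and $\ell(I) \ge \ell(J)$ to get

\[
\pi_{\mathrm{good}}\, \mathbb E_\Omega \sum_{\ell(I)<\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle = \mathbb E_\Omega \sum_{\substack{\ell(I)<\ell(J) \\ I \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle.
\]

From the symmetry between $I$ and $J$ we can conclude that

\[
\pi_{\mathrm{good}}\, \mathbb E_\Omega \sum_{\ell(I)>\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle = \mathbb E_\Omega \sum_{\substack{\ell(I)>\ell(J) \\ J \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle. \tag{4.6}
\]

Substituting (4.5) and (4.6) into (4.4) we get

\begin{align*}
\langle Tf, g\rangle &= \mathbb E_\Omega \sum_{\ell(I)\le\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle + \mathbb E_\Omega \sum_{\ell(I)>\ell(J)} \langle T\Delta_I f, \Delta_J g\rangle \\
&= \pi_{\mathrm{good}}^{-1}\, \mathbb E_\Omega \sum_{\substack{\ell(I)\le\ell(J) \\ I \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle + \pi_{\mathrm{good}}^{-1}\, \mathbb E_\Omega \sum_{\substack{\ell(I)>\ell(J) \\ J \text{ is good}}} \langle T\Delta_I f, \Delta_J g\rangle .
\end{align*}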






□ 

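4.2. Subtracting paraproducts. For a Calderón-Zygmund operator $T$ in $L^2(\mathbb R^d)$ and a dyadic lattice $\mathcal D_\omega$ define the dyadic paraproduct $\Pi^\omega_T$,

\[
\Pi^\omega_T f := \sum_{Q\in\mathcal D_\omega} (E_Q f)\, \Delta_Q T1 .
\]

Here $\Delta_Q T1$ is defined by duality,

\[
\langle \Delta_Q T1, g\rangle := \langle 1, T^*\Delta_Q g\rangle \qquad \forall g \in L^2;
\]

the right side here is well defined, as one can easily show that $T^*\Delta_Q g \in L^1$. (This is a pretty standard place in the theory of Calderón-Zygmund operators.)

Define operators $\widetilde T_\omega$,

\[
\widetilde T_\omega := T - \Pi^\omega_T - (\Pi^\omega_{T^*})^* .
\]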

Remark 4.5. The matrix of the paraproduct $\Pi^\omega_T$ has a very special "triangular" form. Namely, a block $\Delta_R \Pi^\omega_T \Delta_Q$, $Q, R \in \mathcal D_\omega$, can be non-zero only if $R \subsetneq Q$. Notice also that if $\ell(Q) = 2^k$, then the block $\Delta_R \Pi^\omega_T \Delta_Q$ does not depend on the variables $\omega_j$, $j \ge k$.

From the above observation it is easy to see that if $Q, R \in \mathcal D_\omega$, $\max\{\ell(Q), \ell(R)\} = 2^k$, then the block $\Delta_R \widetilde T_\omega \Delta_Q$ does not depend on the variables $\omega_j$, $j \ge k$, and that

\[
\Delta_R \widetilde T_\omega \Delta_Q = \Delta_R T \Delta_Q
\]

if $Q \cap R = \emptyset$ or $Q = R$.

The paraproducts were introduced in Calderón-Zygmund theory in the proofs of the $T(1)$ and $T(b)$ theorems. The main idea is that one can estimate the operators $\widetilde T_\omega$ by estimating the absolute values of the entries of their matrices in the Haar basis, but one cannot, in general, do the same with paraproducts (and so with a general Calderón-Zygmund operator $T$). The paraproducts, however, can be easily estimated by the Carleson Embedding Theorem, using the condition $T1 \in BMO$ ($Tb \in BMO$).

Definition. Let $D(Q,R)$ be the so-called long distance between the cubes $Q$ and $R$, see [24]:

\[
D(Q,R) := \operatorname{dist}(Q,R) + \ell(Q) + \ell(R).
\]

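Lemma 4.6. Let $T$ be a Calderón-Zygmund operator (with parameter $\alpha$), and let $Q, R \in \mathcal D_\omega$, $\ell(Q) \le \ell(R)$. Let $h_Q$ and $h_R$ be Haar functions, $\|h_Q\| = \|h_R\| = 1$. If $Q$ is a good cube, then

\[
|\langle \widetilde T_\omega h_Q, h_R\rangle| \le C\, \frac{\ell(Q)^{\alpha/2}\,\ell(R)^{\alpha/2}}{D(Q,R)^{d+\alpha}}\, |Q|^{1/2} |R|^{1/2},
\]

where $C = C(r_0, d, \alpha, C_{cz}) < \infty$.

The proof is pretty standard, see [24] for example.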

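Lemma 4.7. Let $C = C(r_0, d, \alpha, C_{cz})$ be the constant from the above Lemma 4.6 and let $|a_{Q,R}| \le 1$. Then for any dyadic lattice $\mathcal D_\omega$ and for any $m, n \in \mathbb Z_+$, $m \ge n$, the operator

\[
C^{-1} \sum_{M\in\mathcal D_\omega} \sum_{\substack{Q,R\in\mathcal D_\omega:\ Q,R\subset M \\ \ell(Q)=2^{-m}\ell(M) \\ \ell(R)=2^{-n}\ell(M) \\ Q \text{ is good}}} a_{Q,R}\, 2^{(m+n)\alpha/2}\, \frac{D(Q,R)^{d+\alpha}}{\ell(M)^{d+\alpha}}\, \Delta_R \widetilde T_\omega \Delta_Q
\]

is a dyadic shift with parameters $m, n$, and the same holds if we replace $\Delta_R \widetilde T_\omega \Delta_Q$ by $\Delta_Q \widetilde T_\omega \Delta_R$.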
Proof. We will need the notion of the standard Haar basis here. For an interval $I \subset \mathbb R$ let $h^0_I := |I|^{-1/2}\mathbf 1_I$, and let $h^1_I$ be the standard $L^2$-normalized Haar function,

\[
h^1_I := |I|^{-1/2}\bigl(\mathbf 1_{I_+} - \mathbf 1_{I_-}\bigr),
\]

where $I_+$ and $I_-$ are the right and the left halves of $I$ respectively.

For a cube $Q = I_1 \times I_2 \times \dots \times I_d \subset \mathbb R^d$ and an index $j$, $0 \le j < 2^d$, let

\[
h^j_Q(x) := \prod_{k=1}^d h^{j_k}_{I_k}(x_k),
\]

where $j_k \in \{0, 1\}$ are the coefficients in the binary decomposition $j = \sum_{k=1}^d j_k 2^{k-1}$.

The system $h^j_Q$, $j = 1, \dots, 2^d - 1$, forms an orthonormal basis in $\Delta_Q L^2$, which we will call the standard Haar basis.

Note that $h^0_Q = |Q|^{-1/2}\mathbf 1_Q$.

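The block $\Delta_R \widetilde T_\omega \Delta_Q$ can be represented as

\[
\Delta_R \widetilde T_\omega \Delta_Q = \sum_{j,k=1}^{2^d-1} c_{j,k}(Q,R)\, \langle \,\cdot\,, h^k_Q\rangle\, h^j_R ,
\]

where $c_{j,k}(Q,R) = \langle \widetilde T_\omega h^k_Q, h^j_R\rangle$.

Since $\|h^k_Q\|_\infty = |Q|^{-1/2}$ we can estimate, using Lemma 4.6,

\[
|c_{j,k}(Q,R)| \cdot \|h^k_Q\|_\infty \cdot \|h^j_R\|_\infty \le C\, \frac{\ell(Q)^{\alpha/2}\,\ell(R)^{\alpha/2}}{D(Q,R)^{d+\alpha}}, \tag{4.7}
\]

where $C = C(r_0, d, \alpha, C_{cz})$ is the constant from Lemma 4.6.

Clearly for fixed $j, k$ and the constant $C$ from Lemma 4.6 we can write

\[
C^{-1} \sum_{M\in\mathcal D_\omega} \sum_{\substack{Q,R\in\mathcal D_\omega:\ Q,R\subset M \\ \ell(Q)=2^{-m}\ell(M) \\ \ell(R)=2^{-n}\ell(M) \\ Q \text{ is good}}} a_{Q,R}\, 2^{(m+n)\alpha/2}\, \frac{D(Q,R)^{d+\alpha}}{\ell(M)^{d+\alpha}}\, c_{j,k}(Q,R)\, \langle \,\cdot\,, h^k_Q\rangle\, h^j_R
= \sum_{M\in\mathcal D_\omega} \sum_{\substack{Q,R\in\mathcal D_\omega:\ Q,R\subset M \\ \ell(Q)=2^{-m}\ell(M) \\ \ell(R)=2^{-n}\ell(M) \\ Q \text{ is good}}} \langle \,\cdot\,, \widetilde h_Q\rangle\, \widetilde h_R ,
\]

where $\widetilde h_Q$ and $\widetilde h_R$ are multiples of $h^k_Q$ and $h^j_R$. This sum has the structure of an elementary dyadic shift, and to prove the lemma we only need to estimate $\|\widetilde h_Q\|_\infty \|\widetilde h_R\|_\infty$.

Using (4.7) we get for fixed cubes $Q$ and $R$

\begin{align*}
\|\widetilde h_Q\|_\infty \|\widetilde h_R\|_\infty
&\le \frac{\ell(Q)^{\alpha/2}\,\ell(R)^{\alpha/2}}{D(Q,R)^{d+\alpha}} \cdot 2^{(m+n)\alpha/2} \cdot \frac{D(Q,R)^{d+\alpha}}{\ell(M)^{d+\alpha}} \\
&= \frac{1}{\ell(M)^d} \cdot \frac{\ell(Q)^{\alpha/2}}{\ell(M)^{\alpha/2}} \cdot \frac{\ell(R)^{\alpha/2}}{\ell(M)^{\alpha/2}} \cdot 2^{(m+n)\alpha/2} = \frac{1}{|M|},
\end{align*}

because $\ell(Q)/\ell(M) = 2^{-m}$, $\ell(R)/\ell(M) = 2^{-n}$.

So, the above sum is indeed an elementary dyadic shift with parameters $m, n$. Summing over all $j, k$ we get the conclusion of the lemma. □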
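4.3. Proof of Theorem 4.1. As we explained before, see Lemma 4.3, we can represent $T$ as the average

\[
T = \pi_{\mathrm{good}}^{-1}\, \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R) \\ Q \text{ is good}}} \Delta_R T \Delta_Q
+ \pi_{\mathrm{good}}^{-1}\, \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(R)<\ell(Q) \\ R \text{ is good}}} \Delta_R T \Delta_Q ;
\]

here and below in this section the averages are understood in the weak sense, as equalities of the bilinear forms for $f, g \in C_0^\infty$. As it was explained before in the proof of Lemma 4.3, see Remark 4.4 there, in this case we can freely interchange the summation and expectation (integration) $\mathbb E_\Omega$.

Recalling the decomposition

\[
T = \widetilde T_\omega + \Pi^\omega_T + (\Pi^\omega_{T^*})^*,
\]

and using the fact that for $Q, R \in \mathcal D_\omega$

\[
\Delta_R \Pi^\omega_T \Delta_Q = 0, \qquad \Delta_Q (\Pi^\omega_{T^*})^* \Delta_R = 0
\]

if $\ell(Q) \le \ell(R)$, we can write

\begin{align*}
T ={}& \pi_{\mathrm{good}}^{-1}\, \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R) \\ Q \text{ is good}}} \Delta_R \widetilde T_\omega \Delta_Q
+ \pi_{\mathrm{good}}^{-1}\, \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(R)<\ell(Q) \\ R \text{ is good}}} \Delta_R \widetilde T_\omega \Delta_Q \tag{4.8} \\
&+ \pi_{\mathrm{good}}^{-1}\, \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R) \\ Q \text{ is good}}} \Delta_R (\Pi^\omega_{T^*})^* \Delta_Q
+ \pi_{\mathrm{good}}^{-1}\, \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(R)<\ell(Q) \\ R \text{ is good}}} \Delta_R \Pi^\omega_T \Delta_Q .
\end{align*}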

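Lemma 4.8. For the paraproducts $\Pi^\omega_T$

\[
\mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(R)<\ell(Q) \\ R \text{ is good}}} \Delta_R \Pi^\omega_T \Delta_Q
= \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(R)\le\ell(Q) \\ R \text{ is good}}} \Delta_R \Pi^\omega_T \Delta_Q
= \pi_{\mathrm{good}}\, \mathbb E_\Omega \Pi^\omega_T .
\]

Proof. It is not hard to see from the definition of the paraproduct that for $f \in L^2$

\[
\sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(R)<\ell(Q) \\ R \text{ is good}}} \Delta_R \Pi^\omega_T \Delta_Q f
= \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(R)\le\ell(Q) \\ R \text{ is good}}} \Delta_R \Pi^\omega_T \Delta_Q f
= \sum_{\substack{R\in\mathcal D_\omega \\ R \text{ is good}}} (\Delta_R T1)\, E_R f .
\]

Applying $\mathbb E_\Omega$ we get that

\begin{align*}
\mathbb E_\Omega \sum_{\substack{R\in\mathcal D_\omega \\ R \text{ is good}}} (\Delta_R T1)\, E_R f
&= \sum_{k\in\mathbb Z} \mathbb E_\Omega \mathbb E_{\mathfrak A_k} \sum_{\substack{R\in\mathcal D_\omega \\ \ell(R)=2^k}} (\Delta_R T1)(E_R f)\, \mathbf 1_{\{R \text{ is good}\}}(\omega) \\
&= \pi_{\mathrm{good}} \sum_{k\in\mathbb Z} \mathbb E_\Omega \sum_{\substack{R\in\mathcal D_\omega \\ \ell(R)=2^k}} (\Delta_R T1)\, E_R f = \pi_{\mathrm{good}}\, \mathbb E_\Omega \Pi^\omega_T f ;
\end{align*}

here we again used the fact that by (4.1) $\mathbb E_{\mathfrak A_k} \mathbf 1_{\{R \text{ is good}\}}(\omega) = \pi_{\mathrm{good}}$ for $R \in \mathcal D_\omega$, $\ell(R) = 2^k$. □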

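By Lemma 4.8 the second line in (4.8) is $\mathbb E_\Omega\bigl(\Pi^\omega_T + (\Pi^\omega_{T^*})^*\bigr)$. We know that the paraproducts $\Pi^\omega_T$ and $(\Pi^\omega_{T^*})^*$ are (up to a constant factor $C = C(\alpha, d, C_{cz}, \|T\|)$) generalized dyadic shifts with parameters $0, 1$ and $1, 0$ respectively.

So to prove the theorem we need to represent the first line in (4.8) as the average of dyadic shifts. Let us represent the first term. For $m, n \in \mathbb Z_+$, $m \ge n$, define the dyadic shifts $\mathbb S^\omega_{m,n}$ as

\[
\mathbb S^\omega_{m,n} := 2^{(m+n)\alpha/2} \sum_{M\in\mathcal D_\omega} \sum_{\substack{Q,R\in\mathcal D_\omega:\ Q,R\subset M \\ \ell(Q)=2^{-m}\ell(M),\ \ell(R)=2^{-n}\ell(M) \\ Q \text{ is good}}} \pi(Q|R)\, \rho_{Q,R}^{-1}\, \frac{D(Q,R)^{d+\alpha}}{\ell(M)^{d+\alpha}}\, \Delta_R \widetilde T_\omega \Delta_Q ,
\]

where

\[
\pi(Q|R) = \mathbb P\{Q \text{ is good} \mid R\} = \mathbb E_{\Omega[R]} \mathbf 1_{\{Q \text{ is good}\}}
\]

(note that $\ell(Q) \le \ell(R)$). The weights $\rho_{Q,R}$ are defined by

\[
\rho_{Q,R} := \mathbb E_{\Omega[R]} \sum_{\substack{M\in\mathcal D_\omega \\ Q,R\subset M}} \frac{D(Q,R)^{d+\alpha}}{\ell(M)^{d+\alpha}} \cdot \mathbf 1_{\{Q \text{ is good}\}}(\omega); \tag{4.9}
\]

note that in the above expression we assume (can assume) that the variables $\omega_j$, $j < k$, determining the position of $R$ (and so of $Q$) are fixed; here $2^k = \ell(R)$.

Remark 4.9. In general, $\rho_{Q,R}$ can be zero. However, it is not hard to see that $\rho_{Q,R} > 0$ if $\pi(Q|R) > 0$, so the dyadic shifts $\mathbb S^\omega_{m,n}$ are well defined.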

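Averaging we get

\begin{align*}
\mathbb E_\Omega \sum_{m,n\in\mathbb Z_+:\,m\ge n} 2^{-(m+n)\alpha/2}\, \mathbb S^\omega_{m,n}
&= \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R) \\ \pi(Q|R)\ne 0}} \sum_{\substack{M\in\mathcal D_\omega \\ Q,R\subset M}} \pi(Q|R)\, \rho_{Q,R}^{-1}\, \frac{D(Q,R)^{d+\alpha}}{\ell(M)^{d+\alpha}}\, \mathbf 1_{\{Q \text{ is good}\}}(\omega)\, \Delta_R \widetilde T_\omega \Delta_Q \\
&= \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R) \\ \pi(Q|R)\ne 0}} \pi(Q|R)\, \rho_{Q,R}^{-1}\, \Delta_R \widetilde T_\omega \Delta_Q \sum_{\substack{M\in\mathcal D_\omega \\ Q,R\subset M}} \frac{D(Q,R)^{d+\alpha}}{\ell(M)^{d+\alpha}}\, \mathbf 1_{\{Q \text{ is good}\}}(\omega),
\end{align*}

and recalling the definition of $\rho_{Q,R}$ we conclude

\[
\mathbb E_\Omega \sum_{m,n\in\mathbb Z_+:\,m\ge n} 2^{-(m+n)\alpha/2}\, \mathbb S^\omega_{m,n}
= \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R)}} \pi(Q|R)\, \Delta_R \widetilde T_\omega \Delta_Q .
\]

On the other hand

\begin{align*}
\mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R) \\ Q \text{ is good}}} \Delta_R \widetilde T_\omega \Delta_Q
&= \sum_{k\in\mathbb Z} \mathbb E_\Omega \mathbb E_{\mathfrak A_k} \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R)=2^k}} \mathbf 1_{\{Q \text{ is good}\}}(\omega)\, \Delta_R \widetilde T_\omega \Delta_Q \\
&= \sum_{k\in\mathbb Z} \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R)=2^k}} \bigl(\mathbb E_{\mathfrak A_k} \mathbf 1_{\{Q \text{ is good}\}}\bigr)\, \Delta_R \widetilde T_\omega \Delta_Q \\
&= \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R)}} \pi(Q|R)\, \Delta_R \widetilde T_\omega \Delta_Q ,
\end{align*}

so

\[
\mathbb E_\Omega \sum_{m,n\in\mathbb Z_+:\,m\ge n} 2^{-(m+n)\alpha/2}\, \mathbb S^\omega_{m,n}
= \mathbb E_\Omega \sum_{\substack{Q,R\in\mathcal D_\omega \\ \ell(Q)\le\ell(R) \\ Q \text{ is good}}} \Delta_R \widetilde T_\omega \Delta_Q .
\]

It now remains to show that $\mathbb S^\omega_{m,n}$ are (up to a constant factor) dyadic shifts. The operators $\mathbb S^\omega_{m,n}$ have the appropriate structure, so we only need to prove the estimates, i.e. to prove that the weights $\rho_{Q,R}$ are uniformly bounded away from 0. The necessary estimate follows from Lemma 4.10 below.

So, we have decomposed the first term in (4.8) as the average of dyadic shifts. The decomposition of the second term is carried out similarly, so Theorem 4.1 is proved (modulo Lemma 4.10). □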

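Lemma 4.10. Let $Q, R \in \mathcal D_\omega$, $\ell(Q) \le \ell(R)$. Then

(i) $\pi(Q|R) > 0$ if and only if $Q$ is "good up to the level of $R$", meaning that

\[
\operatorname{dist}(Q, \partial Q') \ge \ell(Q)^{\gamma}\,\ell(Q')^{1-\gamma} \qquad \forall Q' \in \mathcal D_\omega :\ 2^{r_0}\ell(Q) < \ell(Q') \le \ell(R); \tag{4.10}
\]

note that the cubes $Q'$ do not depend on the variables $\omega_j$, $j \ge k$, where $2^k = \ell(R)$.

(ii) There exists a constant $c = c(d, r_0, \gamma)$ such that

\[
\rho_{Q,R} \ge c(d, r_0, \gamma) \qquad \forall Q, R \in \mathcal D_\omega :\ \pi(Q|R) \ne 0 .
\]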

Proof. We want to estimate the conditional probability and expectation with $R$ and $Q$ fixed. That means the lattice up to the level of $R$ is fixed, so nothing changes if we replace $R$ by a cube in the same level. So, without loss of generality we can assume that $Q \subset R$.

Let us first consider a special case. Let $\ell(R) = \ell(Q)\, 2^s$, where

\[
s \ge \frac{2}{\gamma} + r_0\, \frac{1-\gamma}{\gamma}, \tag{4.11}
\]

and let

\[
\operatorname{dist}(Q, \partial R) \ge \tfrac14 \ell(R).
\]

Then the estimate (4.11) implies that

\[
\ell(Q)^{\gamma}\,\bigl[2^{r_0}\ell(R)\bigr]^{1-\gamma} = 2^{-s\gamma}\, 2^{r_0(1-\gamma)}\, \ell(R) \le \tfrac14 \ell(R),
\]

meaning that for any cube $M \in \mathcal D_\omega$, $\ell(R) < \ell(M) \le 2^{r_0}\ell(R)$ (assuming that the lattice $\mathcal D_\omega$ is fixed up to the level of $R$)

\[
\ell(Q)^{\gamma}\,\ell(M)^{1-\gamma} \le \tfrac14 \ell(R) \le \operatorname{dist}(Q, \partial R) \le \operatorname{dist}(Q, \partial M). \tag{4.12}
\]

On the other hand, if $\ell(M) > 2^{r_0}\ell(R)$ and the pair $R$, $M$ is good, meaning that

\[
\operatorname{dist}(R, \partial M) \ge \ell(R)^{\gamma}\,\ell(M)^{1-\gamma},
\]

then

\[
\operatorname{dist}(Q, \partial M) \ge \ell(Q)^{\gamma}\,\ell(M)^{1-\gamma}, \tag{4.13}
\]

so the pair $Q$, $M$ is also good.

Therefore, if the cube $R$ is good, then $Q$ is good as well: as we just discussed, the inequality (4.13) holds if $\ell(M) > 2^{r_0}\ell(R)$, and it holds for $\ell(R) < \ell(M) \le 2^{r_0}\ell(R)$ by (4.12). And the assumption (4.10) covers the remaining cases.

So, in our special case $\pi(Q|R) \ge \pi_{\mathrm{good}}$.
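
The role of condition (4.11) can be checked by a direct computation. The following is a sketch only, assuming (as the surrounding estimates indicate) that the constant in the special case is $1/4$ and that $\ell(R) = 2^s \ell(Q)$.

```latex
% Assumptions: \ell(R) = 2^{s}\ell(Q), 0 < \gamma < 1, and s satisfies (4.11).
\begin{align*}
\ell(Q)^{\gamma}\,\bigl[2^{r_0}\ell(R)\bigr]^{1-\gamma}
  &= \bigl(2^{-s}\ell(R)\bigr)^{\gamma}\, 2^{r_0(1-\gamma)}\,\ell(R)^{1-\gamma}
   = 2^{-s\gamma + r_0(1-\gamma)}\,\ell(R), \\
-s\gamma + r_0(1-\gamma) \le -2
  &\iff s \ge \frac{2}{\gamma} + \frac{r_0(1-\gamma)}{\gamma}.
\end{align*}
% Hence under (4.11) the left-hand side of the first line is at most \ell(R)/4,
% and the same bound holds with 2^{r_0}\ell(R) replaced by any smaller \ell(M).
```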
The general case can be easily reduced to this special situation. Namely, if $Q \subsetneq R$, then with probability at least $2^{-d}$ the parent $\widetilde R$ of $R$ satisfies

\[
\operatorname{dist}(Q, \partial \widetilde R) \ge \tfrac14 \ell(\widetilde R);
\]

one can easily see that for $d = 1$, and considering the coordinates independently, one gets the conclusion.

Applying this procedure $s_0 - 1$ times, where $s_0$ is the smallest integer satisfying (4.11), we arrive (with probability at least $2^{-(s_0-1)d}$) at the special situation we just discussed. Therefore for $Q \subsetneq R$ (equivalently $\ell(Q) < \ell(R)$) statement (i) is proved with the estimate

\[
\pi(Q|R) \ge 2^{-(s_0-1)d}\, \pi_{\mathrm{good}} =: \pi_0 . \tag{4.14}
\]

Finally, if $Q = R$, we arrive with probability 1 at the previous situation, so the statement (i) is now completely proved with estimate (4.14).

The statement (ii) is now easy. First note that if $\tau \in \mathbb Z$ is such that $2^\tau > D(Q,R)$, then

\[
\mathbb P\{\exists M \in \mathcal D_\omega :\ \ell(M) = 2^\tau,\ Q, R \subset M \mid R\} \ge 1 - d \cdot 2D(Q,R)/2^\tau . \tag{4.15}
\]

Indeed, in one dimension the probability that such $M$ does not exist can be estimated above by $2D(Q,R)/2^\tau$, so to get the estimate of non-existence in $\mathbb R^d$ we can just multiply it by $d$. The extra factor 2 appears in the one dimensional case because $M$ cannot be moved continuously, but only in multiples of $\ell(R)$.

Define

\[
\tau_0 := \lfloor \log_2 (d\, D(Q,R)/\pi_0) \rfloor + 3,
\]

so

\[
d \cdot 2D(Q,R)/2^{\tau_0} \le \pi_0/2 .
\]

Comparing the estimates (4.14) and (4.15) of probabilities, we get that for fixed $Q$ and $R$ the probability that $Q$ is good and that $Q, R \subset M$ for some $M \in \mathcal D_\omega$, $\ell(M) = 2^{\tau_0}$, is at least $\pi_0/2$. On the other hand, the definition of $\tau_0$ implies that $\ell(M) = 2^{\tau_0} \le 8 \cdot d \cdot D(Q,R)/\pi_0$, so

\[
D(Q,R)/\ell(M) \ge \frac{\pi_0}{8d} .
\]

Therefore, the contribution to the sum (4.9) defining $\rho_{Q,R}$ of the term with such $M$ alone is at least

\[
\Bigl(\frac{\pi_0}{8d}\Bigr)^{d+\alpha} \frac{\pi_0}{2} .
\]

That proves (ii) and so the lemma. □
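
The arithmetic behind the choice of $\tau_0$ in the proof of (ii) can be laid out as follows. This is a sketch under the assumption that the bracket in the definition of $\tau_0$ is the integer part (floor); $D$ abbreviates $D(Q,R)$ and $\pi_0$ is the constant from (4.14).

```latex
% Assumption: \tau_0 = \lfloor \log_2 (dD/\pi_0) \rfloor + 3, with D = D(Q,R).
\begin{align*}
\log_2\frac{dD}{\pi_0} + 2 \;<\; \tau_0 \;\le\; \log_2\frac{dD}{\pi_0} + 3
  &\;\Longrightarrow\; \frac{4dD}{\pi_0} \;<\; 2^{\tau_0} \;\le\; \frac{8dD}{\pi_0}, \\
2^{\tau_0} > \frac{4dD}{\pi_0}
  &\;\Longrightarrow\; \frac{d\cdot 2D}{2^{\tau_0}} \;<\; \frac{\pi_0}{2}, \\
2^{\tau_0} \le \frac{8dD}{\pi_0}
  &\;\Longrightarrow\; \frac{D}{\ell(M)} = \frac{D}{2^{\tau_0}} \;\ge\; \frac{\pi_0}{8d}, \\
\mathbb{P}\{Q \text{ is good and such } M \text{ exists} \mid R\}
  &\;\ge\; \pi_0 - \frac{\pi_0}{2} \;=\; \frac{\pi_0}{2}.
\end{align*}
% The last line combines (4.14) with (4.15): the event that Q is good has
% conditional probability at least \pi_0, and the event that no such M exists
% has conditional probability less than \pi_0/2.
```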

5. Sharp weighted estimate of dyadic shifts 

Recall that for a dyadic shift $\mathbb S$ with parameters $m$ and $n$ its complexity is $r := \max(m,n)$. In this section we assume that a dyadic lattice $\mathcal D$ is fixed. Let $\mathbb S$ be an elementary (possibly generalized) dyadic shift

\[
\mathbb S f(x) = \sum_{Q\in\mathcal D} \int_Q a_Q(x,y) f(y)\, dy, \tag{5.1}
\]

where $a_Q$ are supported on $Q \times Q$, $\|a_Q\|_\infty \le |Q|^{-1}$ (in this section we will incorporate $|Q|^{-1}$ into $a_Q$). Let $\mathcal A \subset \mathcal D$ be a collection of dyadic cubes. Define the restricted dyadic shift $\mathbb S_{\mathcal A}$ by taking the sum in (5.1) only over $Q \in \mathcal A$.

Since, as was shown in Theorem 4.1, a Calderón-Zygmund operator $T$ is a weighted average of dyadic shifts with exponentially (in complexity of shifts) decaying weights, to prove Theorem 1.1 it is sufficient to get an estimate of the norm of dyadic shifts which is polynomial in complexity. The following theorem, indeed, achieves a norm bound which is quadratic in complexity. This is the second new main result of this paper and represents a substantial quantitative improvement over earlier sharp weighted bounds for dyadic shifts [17], [4], which were exponential in complexity. Note that the paper [12], while using dyadic shifts as auxiliary operators in the original proof of Theorem 1.1, circumvented the question of actually estimating their norm. This is achieved in [12] by going through the test conditions of the rather involved paper [35].

Theorem 5.1. Let $\mathbb S$ be an elementary (possibly generalized) dyadic shift of complexity $r$ in $\mathbb R^d$, such that all restricted shifts $\mathbb S_{\mathcal A}$ are uniformly bounded in $L^2$,

\[
\sup_{\mathcal A \subset \mathcal D} \|\mathbb S_{\mathcal A}\|_{L^2\to L^2} =: B_2 = B_2(\mathbb S) < \infty. \tag{5.2}
\]

Then for any $A_2$ weight $w$

\[
\|\mathbb S f\|_{L^2(w)} \le C\, 2^{3d/2} (r+1)^2 (B_2^2 + 1)\, [w]_{A_2} \|f\|_{L^2(w)}, \qquad \forall f \in L^2(w), \tag{5.3}
\]

where $C$ is an absolute constant.

Note that for the dyadic shifts we are considering (that is, non-generalized dyadic shifts and paraproducts), the assumption about uniform boundedness of $\mathbb S_{\mathcal A}$ is satisfied automatically. Namely, any non-generalized dyadic shift is a contraction in $L^2$, so (5.2) holds with $B_2 = 1$. It is also easy to see that for the paraproducts $\|\mathbb S_{\mathcal A}\|_{L^2\to L^2} \le \|\mathbb S\|_{L^2\to L^2}$.



The estimate (5.3) with $C$ depending exponentially on $r$ was proved (for non-generalized dyadic shifts) in [17]. However, careful analysis of the proofs there allows one (after some modifications) to obtain polynomial estimates.

Compared to [17], the main new ingredients here are:

• The sharp two weight estimate of Haar shifts, see Theorem 3.4 above, which is essentially the main result of [25] (with the additional assumptions about the "size" of the operator), with the dependence of the estimates on all parameters spelled out.

• Proposition 5.1 of [12], reproduced as Theorem 5.2 below, which gives an estimate of the unweighted weak $L^1$ norm of $\mathbb S$ that is linear in the complexity of $\mathbb S$; the corresponding estimate in [17] was exponential in complexity.

Replacing $f$ in (5.3) by $f w^{-1}$ and noticing that $\|f w^{-1}\|_{L^2(w)} = \|f\|_{L^2(w^{-1})}$ we can rewrite it as

\[
\|\mathbb S(f w^{-1})\|_{L^2(w)} \le C\, 2^{3d/2} (r+1)^2 (B_2^2 + 1)\, [w]_{A_2} \|f\|_{L^2(w^{-1})}, \qquad \forall f \in L^2(w^{-1}), \tag{5.4}
\]

so we are in the settings of Theorem 3.4 with $d\mu = w^{-1}dx$, $d\nu = w\,dx$. By Theorem 3.4, to prove estimate (5.4) it is sufficient to show that

\begin{align*}
\int_Q |\mathbb S(\mathbf 1_Q w^{-1})|^2\, w\, dx &\le B\, [w]_{A_2}^2\, w^{-1}(Q), \qquad \forall Q \in \mathcal D, \\
\int_Q |\mathbb S^*(\mathbf 1_Q w)|^2\, w^{-1}\, dx &\le B\, [w]_{A_2}^2\, w(Q), \qquad \forall Q \in \mathcal D, \tag{5.5}
\end{align*}

where

\[
B^{1/2} = C\, 2^d (r+1)(B_2^2 + 1)
\]

with an absolute constant $C$.

Since $[w^{-1}]_{A_2} = [w]_{A_2}$, one can get one estimate from the other by replacing $w$ by $w^{-1}$. Thus, to prove Theorem 5.1 and so the main result (Theorem 1.1) we only need to prove one of the above estimates, for example (5.5).

The rest of the section is devoted to proving (5.5).
5.1. Weak type estimates for dyadic shifts. Let $\|\mathbb S\|_2$ be a shorthand for $\|\mathbb S\|_{L^2\to L^2}$. We say that a shift $\mathbb S$ has scales separated by $r$ levels, if all cubes $Q$ with $a_Q \ne 0$ in (5.1) satisfy $\log_2 \ell(Q) \equiv j \mod r$ for some fixed $j \in \{0, 1, \dots, r-1\}$.

The following result reproduces Proposition 5.1 of [12] with an additional observation concerning shifts which have their scales separated. This seemingly technical variant allows us to obtain the asserted quadratic, rather than cubic, dependence on complexity in Theorem 5.1.

Theorem 5.2. Let $\mathbb S$ be a generalized elementary dyadic shift with parameters $m, n$. Then $\mathbb S$ has weak type 1-1 with the estimate

\[
\|\mathbb S\|_{L^1\to L^{1,\infty}} \le C(d, m, \|\mathbb S\|_2) = 2^{d+2}\|\mathbb S\|_2^2 + 1 + 4m, \tag{5.6}
\]

meaning that for all $f \in L^1$ and for all $\lambda > 0$

\[
|\{x : |\mathbb S f(x)| > \lambda\}| \le \frac{C(d, m, \|\mathbb S\|_2)}{\lambda}\, \|f\|_1 .
\]

If $\mathbb S$ has scales separated by $r \ge m$ levels, then we have the improved estimate

\[
\|\mathbb S\|_{L^1\to L^{1,\infty}} \le C(d, 1, \|\mathbb S\|_2) = 2^{d+2}\|\mathbb S\|_2^2 + 5 .
\]

Proof. Our shift $\mathbb S$ can be written (see (3.2)) as

\[
\mathbb S f(x) = \sum_{Q\in\mathcal D} \int_Q a_Q(x,y) f(y)\, dy,
\]

where $a_Q$ is supported on $Q \times Q$ and $\|a_Q\|_\infty \le |Q|^{-1}$ (we incorporated the factor $|Q|^{-1}$ from (3.2) into $a_Q$ here). It follows from the representation (3.3) of $a_Q$ that for fixed $x$ the function $a_Q(x, \cdot\,)$ is constant on cubes $Q' \in \mathcal D$, $\ell(Q') < 2^{-m}\ell(Q)$.

To estimate its weak norm we use the standard Calderón-Zygmund decomposition at height $\lambda > 0$ with respect to the dyadic lattice $\mathcal D$. Namely, as it is well known, see for example iTTTl p. 286], given $f \in L^1$ there exists a decomposition $f = g + b$, $b = \sum_{Q\in\mathcal Q} b_Q$, where $\mathcal Q \subset \mathcal D$ is a collection of disjoint dyadic cubes, such that

(i) $\|g\|_1 \le \|f\|_1$, $\|g\|_\infty \le 2^d \lambda$.

(ii) Each function $b_Q$ is supported on a cube $Q$ and

\[
\|b_Q\|_1 \le 2\|\mathbf 1_Q f\|_1, \qquad \int_Q b_Q\, dx = 0 .
\]

(iii) $\sum_{Q\in\mathcal Q} |Q| \le \lambda^{-1}\|f\|_1$.

The property (i) of the Calderón-Zygmund decomposition implies that

\[
\|g\|_2^2 \le 2^d \lambda \|f\|_1 . \tag{5.7}
\]

As usual, we can estimate

\[
|\{x : |\mathbb S f(x)| > \lambda\}| \le |\{x : |\mathbb S g(x)| > \lambda/2\}| + |\{x : |\mathbb S b(x)| > \lambda/2\}|
\]

(one of the two terms should be at least half of the sum). The measure of the first set is estimated using the boundedness of $\mathbb S$ in $L^2$:

\[
|\{x : |\mathbb S g(x)| > \lambda/2\}| \le \frac{4}{\lambda^2}\, \|\mathbb S\|_2^2\, \|g\|_2^2 \le \frac{2^{d+2}}{\lambda}\, \|\mathbb S\|_2^2\, \|f\|_1 ,
\]

where $\|\mathbb S\|_2$ is the shorthand for $\|\mathbb S\|_{L^2\to L^2}$; we used (5.7) to get the second inequality.
\[
|\mathbb S b_Q(x)| \le \Bigl| \sum_{R\in\mathcal D:\ Q\subsetneq R} \int_R a_R(x,y)\, b_Q(y)\, dy \Bigr| + \Bigl| \sum_{R\in\mathcal D:\ R\subset Q} \int_R a_R(x,y)\, b_Q(y)\, dy \Bigr| .
\]

Therefore, summing in $Q \in \mathcal Q$, we get

\[
|\mathbb S b(x)| \le \sum_{Q\in\mathcal Q} \Bigl| \sum_{R\in\mathcal D:\ Q\subsetneq R} \int_R a_R(x,y)\, b_Q(y)\, dy \Bigr| + \sum_{Q\in\mathcal Q} \Bigl| \sum_{R\in\mathcal D:\ R\subset Q} \int_R a_R(x,y)\, b_Q(y)\, dy \Bigr| =: A(x) + B(x).
\]

Hence, using again the fact that one of the two terms should be at least a half of the sum, we can estimate

\[
|\{x : |\mathbb S b(x)| > \lambda/2\}| \le |\{x : A(x) > \lambda/2\}| + |\{x : B(x) > 0\}| .
\]

The second set is obviously inside $\bigcup_{Q\in\mathcal Q} Q$: indeed the function $B(x)$ vanishes outside this set because $a_R(x,y) = 0$ for all $x \notin R$, and $R \subset Q$. So, using the property (iii) of the Calderón-Zygmund decomposition, we can estimate the measure of the second set as

\[
|\{x : B(x) > 0\}| \le \sum_{Q\in\mathcal Q} |Q| \le \frac{1}{\lambda}\|f\|_1 .
\]

To estimate the first measure we want to show that $\|A\|_1 \le C\|f\|_1$; then clearly

\[
|\{x : A(x) > \lambda/2\}| \le \frac{2}{\lambda}\|A\|_1 \le \frac{2C}{\lambda}\|f\|_1 . \tag{5.8}
\]

We will estimate the norm of each term in $A$ separately. Let us fix $Q \in \mathcal Q$ and let us consider

\[
A_Q(x) := \sum_{R\in\mathcal D:\ Q\subsetneq R} \int_R a_R(x,y)\, b_Q(y)\, dy .
\]

Since the function $b_Q$ is orthogonal to constants, and the function $a_R(x, \cdot\,)$ is constant on cubes $Q \in \mathcal D$, $\ell(Q) < 2^{-m}\ell(R)$, we can see that the only cubes $R$ which may contribute to $A_Q$ are the ancestors of $Q$ of orders $1, \dots, m$. So, in general, there are at most $m$ non-zero terms in $A_Q$; if $\mathbb S$ has scales separated by $r \ge m$ levels, there is at most one.

Recalling that for an integral operator $T$ with kernel $K$

\[
\|T\|_{L^1\to L^1} = \operatorname*{ess\,sup}_y \|K(\,\cdot\,, y)\|_1 ,
\]

we can see that the integral operator with kernel $a_R$ is a contraction in $L^1$. Since at most $m$ such operators contribute to $A_Q$,

\[
\|A_Q\|_1 \le m\|b_Q\|_1 \le 2m\|\mathbf 1_Q f\|_1 ;
\]

the last inequality here holds because of property (ii) of the Calderón-Zygmund decomposition. Summing over all $Q \in \mathcal Q$ we get

\[
\|A\|_1 \le 2m \sum_{Q\in\mathcal Q} \|\mathbf 1_Q f\|_1 \le 2m\|f\|_1 ,
\]

so (see (5.8))

\[
|\{x : A(x) > \lambda/2\}| \le \frac{2}{\lambda}\|A\|_1 \le \frac{4m}{\lambda}\|f\|_1 .
\]

If $\mathbb S$ has scales separated by $r \ge m$ levels, we can take 1 in place of $m$ in the last few estimates. □
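
The constant in (5.6) is simply the sum of the three contributions obtained in the proof. A short tally, using only the estimates stated above, may be written as:

```latex
% Each line is multiplied by \lambda / \|f\|_1:
%   good part g:          (4/\lambda^2)\,\|S\|_2^2\,\|g\|_2^2,  with \|g\|_2^2 \le 2^d \lambda \|f\|_1   (5.7)
%   the set \{B > 0\}:    \sum_Q |Q| \le \lambda^{-1}\|f\|_1                                      (iii)
%   the set \{A > \lambda/2\}:  (2/\lambda)\,\|A\|_1 \le (2/\lambda)\cdot 2m\,\|f\|_1
\begin{align*}
\lambda\,\bigl|\{x : |\mathbb{S}f(x)| > \lambda\}\bigr|
  &\le \bigl( 2^{d+2}\|\mathbb{S}\|_2^2 + 1 + 4m \bigr)\,\|f\|_1 .
\end{align*}
% With scales separated by r \ge m levels, m is replaced by 1 in the last
% contribution, which gives 2^{d+2}\|\mathbb{S}\|_2^2 + 5.
```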
Using this improved weak type estimate one can get the desired estimate (5.5) by following the
proof in [17] and keeping track of the constants. However, there are several other places in [17] where
the curse of exponentiality appears. So for the convenience of the reader, we are doing all necessary
estimates below. Note that an analogous modification of [17] was already carried out in [12]; here
we present yet another argument in the spirit of [17] but with modifications pertinent to eliminating the
curse of exponentiality.

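5.2. First slicings. Let us fix $Q_0 \in \mathcal D$, and let us prove estimate (5.5) for $Q = Q_0$. Recall that $\mathbb S$ is an integral operator with kernel $\sum_{Q\in\mathcal D} a_Q(x,y)$, where $|Q|^{-1}$, as in the previous section, is incorporated in $a_Q$.

Define

\[
f_Q(x) := \int_{Q_0} a_Q(x,y)\, w(y)\, dy,
\]

so

\[
f := \mathbb S(\mathbf 1_{Q_0} w) = \sum_{Q\in\mathcal D:\ Q\cap Q_0 \ne \emptyset} f_Q .
\]

We can split $f$ into "inner" and "outer" parts,

\[
f = \sum_{Q\in\mathcal D:\ Q\subset Q_0} f_Q + \sum_{Q\in\mathcal D:\ Q_0\subsetneq Q} f_Q =: f_i + f_o .
\]

The "outer" part $f_o$ is easy to estimate. Since $\|a_Q(x, \cdot\,)\|_\infty \le |Q|^{-1}$, we can write for $Q_0 \subsetneq Q$

\[
|f_Q(x)| \le w(Q_0)\, |Q|^{-1},
\]

and summing over all $Q$, $Q_0 \subsetneq Q$,

\[
|f_o(x)| \le |Q_0|^{-1} w(Q_0) \sum_{Q\in\mathcal D:\ Q_0\subsetneq Q} |Q|^{-1}|Q_0| = |Q_0|^{-1} w(Q_0) \sum_{k=1}^\infty 2^{-kd} \le |Q_0|^{-1} w(Q_0).
\]

Therefore,

\[
\int_{Q_0} |f_o|^2\, w^{-1} \le |Q_0|^{-2}\, w(Q_0)^2\, w^{-1}(Q_0) \le [w]_{A_2}\, w(Q_0),
\]

so $\|\mathbf 1_{Q_0} f_o\|_{L^2(w^{-1})} \le [w]_{A_2}^{1/2}\, w(Q_0)^{1/2}$, and it only remains to estimate $\|f_i\|_{L^2(w^{-1})}$.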
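
The last inequality for the outer part is just the definition of the $A_2$ characteristic applied to the cube $Q_0$; written out as a sketch:

```latex
% Uses only the pointwise bound |f_o| \le |Q_0|^{-1} w(Q_0) and the A_2 condition on Q_0.
\begin{align*}
\int_{Q_0} |f_o|^2\, w^{-1}
  &\le \Bigl(\frac{w(Q_0)}{|Q_0|}\Bigr)^{2} w^{-1}(Q_0)
   = \frac{w(Q_0)}{|Q_0|}\cdot\frac{w^{-1}(Q_0)}{|Q_0|}\cdot w(Q_0)
   \le [w]_{A_2}\, w(Q_0).
\end{align*}
```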

Now we perform the first splitting. Let $r$ be the complexity of the shift $\mathbb S$. Let us split the lattice $\mathcal D$ into $r + 1$ lattices $\mathcal D^j_r$, $j = 0, 1, \dots, r$, where each lattice $\mathcal D^j_r$ consists of the cubes $Q \in \mathcal D$ of size $2^{j+(r+1)\tau}$, $\tau \in \mathbb Z$.

If we can show that uniformly in $j$

\[
\int_{Q_0} \Bigl| \sum_{Q\in\mathcal D^j_r:\ Q\subset Q_0} f_Q \Bigr|^2 w^{-1} \le C\, 2^{2d} (B_2^2 + 1)^2\, [w]_{A_2}^2\, w(Q_0), \tag{5.9}
\]

where $C$ is an absolute constant, then we are done. Indeed, taking the sum over all $j = 0, 1, \dots, r$ we only multiply the estimate of the norm by $r + 1$, so to get from the estimate (5.9) to the desired estimate (5.5) we just need to multiply the right side of (5.9) by $(r+1)^2$.

The main reason for this splitting of $\mathbb S$ is that it simplifies the structure, meaning that for $Q \in \mathcal D^j_r$ the function $f_Q$ is constant on the children of $Q$ in the lattice $\mathcal D^j_r$. Also note that the shift $\mathbb S^j f(x) := \sum_{Q\in\mathcal D^j_r} \int_Q a_Q(x,y) f(y)\, dy$ has scales separated by $r + 1 > m$ levels, and l^j(Q) ■ /q = § ; (lg w).

Let us fix $j$, and let us from now on consider the lattice $\mathcal D^j_r$. Since $j$ is not important in what follows, we will skip it and use the notation $\mathcal D_r$, freeing the symbol $j$ for use in a different context. We also denote $\mathbb S^j$ simply by $\mathbb S$, bearing in mind the separation of scales which allows the use of the sharper estimate in the weak-type bound of Theorem 5.2.

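Now we split the lattice $\mathcal D_r$ into the collections $\mathcal Q_k$, $k \in \mathbb Z_+$, $k \le \log_2([w]_{A_2})$, where each $\mathcal Q_k$ is the set of all cubes $Q \in \mathcal D_r$ such that

\[
2^k \le \frac{w(Q)}{|Q|} \cdot \frac{w^{-1}(Q)}{|Q|} < 2^{k+1} . \tag{5.10}
\]

We want to show that

\[
\int_{Q_0} \Bigl| \sum_{Q\in\mathcal Q_k:\ Q\subset Q_0} f_Q \Bigr|^2 w^{-1} \le C_1\, 2^k\, [w]_{A_2}\, w(Q_0), \tag{5.11}
\]

where $C_1 = C\, 2^{2d} (B_2^2 + 1)^2$ is the constant in the right side of (5.9). Then, using the triangle inequality and summing the geometric progression we get

\[
\Bigl\| \sum_{Q\in\mathcal D_r:\ Q\subset Q_0} f_Q \Bigr\|_{L^2(w^{-1})} \le C_1^{1/2}\, [w]_{A_2}^{1/2} \sum_{k \le \log_2 [w]_{A_2}} 2^{k/2}\, w(Q_0)^{1/2} \le 4\, C_1^{1/2}\, [w]_{A_2}\, w(Q_0)^{1/2},
\]

so (5.11) implies that (5.9) holds with $C = 16 C_1$.

So, we reduced the main result to the estimate (5.11) with $C_1 = C\, 2^{2d} (B_2^2 + 1)^2$. Note that if we prove (5.11) for $Q_0 \in \mathcal Q_k$ then we are done, because for general $Q_0$ we can add up the estimates for the maximal subcubes of $Q_0$ belonging to $\mathcal Q_k$.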
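
The factor 4 obtained from summing the geometric progression can be verified directly. In the sketch below, $K$ is a name introduced here for the largest admissible index, $K = \lfloor \log_2 [w]_{A_2} \rfloor$.

```latex
% Assumption: K = \lfloor \log_2 [w]_{A_2} \rfloor, so that 2^{K} \le [w]_{A_2}.
\begin{align*}
\sum_{k=0}^{K} 2^{k/2}
  &= \frac{2^{(K+1)/2} - 1}{\sqrt{2} - 1}
   < \frac{\sqrt{2}}{\sqrt{2}-1}\, 2^{K/2}
   = (2 + \sqrt{2})\, 2^{K/2}
   \le (2+\sqrt{2})\, [w]_{A_2}^{1/2}
   < 4\, [w]_{A_2}^{1/2}.
\end{align*}
% Multiplying by C_1^{1/2} [w]_{A_2}^{1/2} w(Q_0)^{1/2} and squaring gives
% the bound 16\, C_1\, [w]_{A_2}^{2}\, w(Q_0) on the right side of (5.9).
```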



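5.3. Stopping moments and Corona decomposition. Let us suppose that the weight $w$ and the lattices $\mathcal D_r$ and $\mathcal Q = \mathcal Q_k \subset \mathcal D_r$ described above are fixed.

Given a cube $Q_0 \in \mathcal Q = \mathcal Q_k$ let us construct the generations $\mathcal G^*_\tau = \mathcal G^*_\tau(Q_0) = \mathcal G^*_\tau(Q_0, w, \mathcal Q)$, $\tau \in \mathbb Z_+$, of stopping cubes as follows. Define the initial generation $\mathcal G^*_0$ to be the cube $Q_0$.

For all cubes $Q \in \mathcal G^*_\tau$ we consider maximal cubes $Q' \in \mathcal Q$, $Q' \subset Q$, such that

\[
\frac{w(Q')}{|Q'|} > 4\, \frac{w(Q)}{|Q|};
\]

the collection of all such cubes $Q'$ is the next generation $\mathcal G^*_{\tau+1}$ of the stopping cubes.

Let $\mathcal G^* = \mathcal G^*(Q_0) := \bigcup_{\tau\ge 0} \mathcal G^*_\tau$ be the collection of all stopping cubes.

Note that if we start constructing stopping moments from a cube $Q \in \mathcal G^*$, the stopping moments $\mathcal G^*(Q)$ will agree with $\mathcal G^*$, meaning that

\[
\mathcal G^*(Q) = \{Q' \in \mathcal G^* : Q' \subset Q\}.
\]

Let us introduce the last piece of notation. For a cube $Q \in \mathcal G^*$ let us define $\mathcal Q(Q) := \{Q' \in \mathcal Q : Q' \subset Q\}$, and let

\[
\mathcal P(Q) := \mathcal Q(Q) \setminus \bigcup_{Q'\in\mathcal G^*:\ Q'\subsetneq Q} \mathcal Q(Q').
\]

The above definitions make sense for arbitrary $Q \in \mathcal Q$, but we will use them only for $Q \in \mathcal G^*$, so we included this assumption in the definition. Note that for $Q_0 \in \mathcal Q$ the set $\mathcal Q(Q_0)$ admits the following disjoint decomposition

\[
\mathcal Q(Q_0) = \bigcup_{Q\in\mathcal G^*(Q_0)} \mathcal P(Q). \tag{5.12}
\]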
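5.3.1. Properties of stopping moments. It follows from the construction of $\mathcal G^*$ that if $R \in \mathcal G^*$ and $Q$ is a maximal cube in $\mathcal G^*$ such that $Q \subsetneq R$, then

\[
\frac{w(Q)}{|Q|} > 4\, \frac{w(R)}{|R|} . \tag{5.13}
\]

The estimate (5.13) implies

\[
|Q| < \frac{|R|}{4} \cdot \frac{w(Q)}{w(R)}, \tag{5.14}
\]

and summing over all such maximal $Q \in \mathcal G^*$, $Q \subsetneq R$ (assume that $R \in \mathcal G^*_\tau$) we get

\[
\Bigl| \bigcup_{Q\in\mathcal G^*_{\tau+1}:\ Q\subsetneq R} Q \Bigr| \le \frac14 |R| \tag{5.15}
\]

for all $R \in \mathcal G^*_\tau$.

Repeating this estimate for each $Q$ and summing over the generations we get

\[
\sum_{Q\in\mathcal G^*:\ Q\subsetneq R} |Q| \le |R| \sum_{n=1}^\infty 4^{-n} = \frac13 |R| .
\]

Adding $|R|$ to this sum we get the following Carleson property of the stopping moments:

\[
\sum_{Q\in\mathcal G^*:\ Q\subset R} |Q| \le \frac43 |R| . \tag{5.16}
\]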

It is easy to see that this estimate holds for all cubes $R$, not just for $R \in \mathcal G^*$: one just needs to consider
maximal cubes $R' \in \mathcal G^*$, $R' \subset R$, and apply (5.16) to each of these cubes.
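
The passage from (5.14) to (5.15) and then to the Carleson property (5.16) uses only the disjointness of the maximal stopping cubes and a geometric series. A sketch, in which $\mathcal G^*_{\tau+n}$ denotes the generation $\tau + n$ of stopping cubes and $R$ belongs to generation $\tau$:

```latex
% Step 1: the maximal stopping cubes Q inside R are pairwise disjoint subsets of R,
% so the sum of w(Q) over them is at most w(R).
\begin{align*}
\sum_{Q \in \mathcal{G}^*_{\tau+1},\ Q \subsetneq R} |Q|
  &< \frac{|R|}{4\,w(R)} \sum_{Q \in \mathcal{G}^*_{\tau+1},\ Q \subsetneq R} w(Q)
   \le \frac{|R|}{4}. \\
% Step 2: iterating over n generations gains a factor 1/4 each time.
\sum_{Q \in \mathcal{G}^*_{\tau+n},\ Q \subset R} |Q|
  &\le 4^{-n}\,|R|, \qquad n = 0, 1, 2, \dots \\
% Step 3: summing over all generations, including R itself (n = 0).
\sum_{Q \in \mathcal{G}^*,\ Q \subset R} |Q|
  &\le |R| \sum_{n=0}^{\infty} 4^{-n} = \frac{4}{3}\,|R|.
\end{align*}
```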
Iterating (5.15) and summing over all generations we get



(5.17) 

' Qetf*,QcR 

We need the following simple lemma 
Lemma 5.3. For any $R \in \mathcal D$
(5.18) 



£=0 



i/2 



\[
\sum_{Q\in\mathcal G^*:\ Q\subset R} w(Q) \le C\, [w]_{A_2}\, w(R),
\]



where C is an absolute constant. 



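Proof. The Carleson Embedding Theorem (see Theorem 3.6 above) applied to $\mathbf 1_R w^{1/2}$ together with the Carleson property (5.16) imply that

\[
\sum_{Q\in\mathcal G^*:\ Q\subset R} \Bigl( \frac{1}{|Q|} \int_Q w^{1/2} \Bigr)^2 |Q| \le C\, \|\mathbf 1_R w^{1/2}\|_2^2 = C\, w(R)
\]

(the best constant is $C = 4 \cdot 4/3$). But

\[
1 \le \Bigl( \frac{1}{|Q|} \int_Q w^{1/2} \Bigr) \Bigl( \frac{1}{|Q|} \int_Q w^{-1/2} \Bigr) \le \Bigl( \frac{1}{|Q|} \int_Q w^{1/2} \Bigr) \Bigl( \frac{1}{|Q|} \int_Q w^{-1} \Bigr)^{1/2}
\]

by Cauchy-Schwarz, and because $\bigl( \frac{1}{|Q|} \int_Q w \bigr) \bigl( \frac{1}{|Q|} \int_Q w^{-1} \bigr) \le [w]_{A_2}$,

so

\[
\frac{1}{|Q|} \int_Q w \le [w]_{A_2} \Bigl( \frac{1}{|Q|} \int_Q w^{1/2} \Bigr)^2 ,
\]

and the lemma is proved (with $C = 16/3$). □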

This proof was (essentially) present in [39]. In ifPTI a different proof, using a clever iteration
argument and giving the better constant $C = 16/9$, was presented.

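5.4. John-Nirenberg type estimates. Given a collection $\mathcal{A}$ of cubes, $\mathcal{A} \subset \mathcal{D}^r$, define the function

$$f_{\mathcal{A}} := \sum_{Q \in \mathcal{A}} f_Q.$$

For the cube $Q_0 \in \mathcal{G}^*$ consider the function $f_{\mathcal{Q}(Q_0)}$. By (5.12) the function $f_{\mathcal{Q}(Q_0)}$ can be decomposed as

$$f_{\mathcal{Q}(Q_0)} = \sum_{R \in \mathcal{G}^*} f_{\mathcal{P}(R)}, \tag{5.19}$$

where recall $\mathcal{G}^* := \mathcal{G}^*(Q_0)$ is the collection of stopping cubes.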

The main reason for introducing this decomposition is that, as we will show below, the functions $f_{\mathcal{P}(R)}$ behave in many respects as BMO functions: they have exponentially decaying distribution functions, so, in particular, all $L^p$ norms for $p < \infty$ are equivalent.

In the proof of these facts the weak $L^1$ estimate of dyadic shifts (Theorem 5.2) is used.

The first lemma, which is Lemma 3.15 in [11], is a simple observation that for the John-Nirenberg estimates of the distribution function it is sufficient to have weak type estimates.

Recall that $\mathcal{D}^r$ is a $2^r$-adic lattice, i.e. the children $Q'$ of $Q$ satisfy $\ell(Q') = 2^{-r} \ell(Q)$.

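Definition 5.4. Let $\phi_Q$, $Q \in \mathcal{D}^r$, be a collection of functions such that $\phi_Q$ is supported on $Q$ and is constant on the children (in $\mathcal{D}^r$) of $Q$. For $R_0 \in \mathcal{D}^r$ let $\phi^*_{R_0}$ be the maximal function

$$\phi^*_{R_0}(x) := \sup_{Q \in \mathcal{D}^r :\, Q \ni x} \Big| \sum_{R \in \mathcal{D}^r :\, Q \subsetneq R \subset R_0} \phi_R(x) \Big|.$$

Lemma 5.5. Let $\phi_Q$, $Q \in \mathcal{D}^r$, be a collection of functions such that

(i) $\phi_Q$ is supported on $Q$ and constant on the children (in $\mathcal{D}^r$) of $Q$;

(ii) $\|\phi_Q\|_\infty \le 1$;

(iii) There exists $\delta \in (0, 1)$ such that for all cubes $R \in \mathcal{D}^r$

$$|\{x \in R : \phi^*_R(x) > 1\}| \le \delta |R|.$$

Then for all $R \in \mathcal{D}^r$ and for all $t > 0$

$$|\{x \in R : \phi^*_R(x) > t\}| \le \delta^{(t-1)/2} |R|.$$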

Proof. Let us prove the conclusion of the lemma for a fixed cube $R = R_0 \in \mathcal{D}^r$.
Let $\mathcal{B}_1$ be the collection of all maximal cubes $Q \in \mathcal{D}^r$, $Q \subset R_0$, such that

$$\Big| \sum_{R \in \mathcal{D}^r :\, Q \subsetneq R \subset R_0} \phi_R(x) \Big| > 1, \qquad x \in Q; \tag{5.20}$$

note that the functions $\phi_R$ (and so the sum) are constant on the cube $Q$.
Define the set $B_1$,

$$B_1 := \bigcup_{Q \in \mathcal{B}_1} Q.$$

It follows from the construction that $\phi^*_{R_0} \le 1$ outside of $B_1$, and that for any $Q \in \mathcal{B}_1$ the sum in (5.20) is at most 2. Note also that by the assumption (iii) we have that $|B_1| \le \delta |R_0|$.

For each cube $R \in \mathcal{B}_1$ we repeat the above construction (with $R$ instead of $R_0$); we will get a collection of stopping cubes $\mathcal{B}_2$ and the set $B_2 = \bigcup_{Q \in \mathcal{B}_2} Q$, $B_2 \subset B_1$, $|B_2| \le \delta^2 |R_0|$. It is easy to see that $\phi^*_{R_0} \le 2 + 1 = 3$ outside of $B_2$ and that for any cube $Q \in \mathcal{B}_2$

$$\Big| \sum_{R \in \mathcal{D}^r :\, Q \subsetneq R \subset R_0} \phi_R(x) \Big| \le 4, \qquad x \in Q$$

(sums outside of $R \in \mathcal{B}_1$ contribute at most 2, and the sums starting at $R \in \mathcal{B}_1$ contribute at most 1 outside of $B_2$ and at most 2 on $Q \in \mathcal{B}_2$).

Repeating this procedure we get the collections $\mathcal{B}_n$ of "stopping cubes" and the decreasing sequence of sets $B_n = \bigcup_{Q \in \mathcal{B}_n} Q$, such that

$$|B_n| \le \delta^n |R_0|; \tag{5.21}$$

$$\phi^*_{R_0} \le 2n - 1 \quad \text{outside of } B_n; \tag{5.22}$$

$$\Big| \sum_{R \in \mathcal{D}^r :\, Q \subsetneq R \subset R_0} \phi_R(x) \Big| \le 2n \qquad \forall Q \in \mathcal{B}_n \ \forall x \in Q;$$

the last inequality is only needed for the inductive construction.
Given $t > 1$ let $n$ be the largest integer such that $2n - 1 \le t$,

$$n = \lfloor (t+1)/2 \rfloor.$$

By (5.22)

$$\phi^*_{R_0}(x) \le 2n - 1 \le t \qquad \forall x \notin B_n,$$

so

$$|\{x \in R_0 : \phi^*_{R_0}(x) > t\}| \le |B_n| \le \delta^n |R_0| \le \delta^{(t-1)/2} |R_0|.$$

This completes the proof for $t > 1$, but for $0 < t \le 1$ the conclusion is trivial. □
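
As a sanity check of the exponent in Lemma 5.5, here is the arithmetic behind the last step written out, together with one numerical instance. The value $t = 4.5$ is an arbitrary choice made for illustration, and the condition defining $n$ is read as $2n - 1 \le t$, which is what the formula $n = \lfloor (t+1)/2 \rfloor$ requires.

```latex
% n is the largest integer with 2n - 1 \le t, i.e. n = \lfloor (t+1)/2 \rfloor, hence
n \ge \frac{t+1}{2} - 1 = \frac{t-1}{2},
\qquad\text{so}\qquad
\delta^{n} \le \delta^{(t-1)/2} \quad (0 < \delta < 1).
% Example: t = 4.5 gives n = \lfloor 2.75 \rfloor = 2, 2n - 1 = 3 \le 4.5,
% and \delta^{2} \le \delta^{1.75}.
```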

As it was shown above in Theorem 5.2, the weak $L^1$ norm of a dyadic shift $\mathbb{S}$ of complexity $r$, with scales separated by $r + 1$ levels, can be estimated by $C = 2^{d+2} \|\mathbb{S}\|^2 + 5$, so the weak $L^1$ norm of our dyadic shift $\mathbb{S}$ and all its subshifts $\mathbb{S}_{\mathcal{A}}$, $\mathcal{A} \subset \mathcal{D}^r$, can be estimated by

$$B_1 := 2^{d+2} B_2^2 + 5, \tag{5.23}$$

where

$$B_2 := \sup_{\mathcal{A} \subset \mathcal{D}^r} \|\mathbb{S}_{\mathcal{A}}\|.$$

Now we need the following lemma, which is essentially Lemma 4.7 from [11] with all constants written down; in fact, certain modifications in the argument are needed to avoid introducing exponential dependence on $r$, which was (implicitly) the case in [11]. Such a modification (with linear dependence on $r$) was first obtained in Lemma 7.2 of [12]; here we even achieve an estimate uniform with respect to $r$ by taking into account the separation of scales of our shift, and the resulting improvement in the estimate of Theorem 5.2.
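Let $\mathcal{P} \subset \mathcal{D}^r$ be a collection of cubes. Define the maximal function $f^*_{\mathcal{P}}$ (compare with Definition 5.4) by

$$f^*_{\mathcal{P}}(x) := \sup_{Q \in \mathcal{D}^r :\, Q \ni x} \Big| \sum_{R \in \mathcal{P} :\, Q \subsetneq R} f_R(x) \Big|. \tag{5.24}$$

For the function $f_{\mathcal{P}(R_0)}$, $R_0 \in \mathcal{G}^*$, defined above in the beginning of Section 5.4 we have $|f_{\mathcal{P}(R_0)}| \le f^*_{\mathcal{P}(R_0)}$, so we will use $f^*_{\mathcal{P}(R_0)}$ to estimate the distribution function of $|f_{\mathcal{P}(R_0)}|$.

Note that for $R \in \mathcal{P}$ we cannot guarantee that its children in $\mathcal{D}^r$ are in $\mathcal{P}$. So while in the above definition the sums are taken over all $R \in \mathcal{P}$, we need to take the supremum over $Q \in \mathcal{D}^r$.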

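Lemma 5.6. Let $B_1$ be given by (5.23). Then for any $R \in \mathcal{G}^*$ we have

$$\Big| \Big\{ x \in R : f^*_{\mathcal{P}(R)}(x) > 16 t\, \frac{w(R)}{|R|} \Big\} \Big| \le 2\sqrt{2} \cdot 2^{-t/(2B_1)} |R|, \tag{5.25}$$

$$w^{-1} \Big( \Big\{ x \in R : f^*_{\mathcal{P}(R)}(x) > 20 t\, \frac{w(R)}{|R|} \Big\} \Big) \le 24 \cdot 2^{-t/(2B_1)} w^{-1}(R). \tag{5.26}$$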



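Proof. Now it is time to perform the last splitting. Namely, let us split the set $\mathcal{P}(R)$ into the sets $\mathcal{P}_\alpha(R)$, $\alpha \in \mathbb{Z}_+$, where the collection $\mathcal{P}_\alpha = \mathcal{P}_\alpha(R)$ consists of all cubes $Q \in \mathcal{P}(R)$ for which

$$4^{-\alpha}\, \frac{w(R)}{|R|} < \frac{w(Q)}{|Q|} \le 4^{-\alpha+1}\, \frac{w(R)}{|R|}. \tag{5.27}$$

Note that by the construction of stopping moments

$$\frac{w(Q)}{|Q|} \le 4\, \frac{w(R)}{|R|},$$

so we do not need $\alpha < 0$.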
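We can estimate

$$f^*_{\mathcal{P}(R)} \le \sum_{\alpha \ge 0} f^*_{\mathcal{P}_\alpha(R)}.$$

Now let us estimate the level sets of $f^*_{\mathcal{P}_\alpha(R)}$ using the above Lemma 5.5. For $Q \in \mathcal{P}_\alpha(R)$

$$\|f_Q\|_\infty \le \frac{w(Q)}{|Q|} \le 4^{-\alpha+1}\, \frac{w(R)}{|R|} \le 2^{-2\alpha+3} B_1\, \frac{w(R)}{|R|}$$

(recall that $B_1 > 1$).
To this end, let

$$\phi_Q := \frac{2^{2\alpha-3} |R|}{B_1 w(R)}\, f_Q, \qquad Q \in \mathcal{P}_\alpha(R)$$

(and $\phi_Q := 0$ for the remaining cubes $Q \in \mathcal{D}^r$), so that $\phi_Q$ satisfies the first two assumptions of Lemma 5.5.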

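Recall the notation $\phi^*_{R_1}$ from Definition 5.4. We want to use the weak type estimate for shifts to estimate the size of the set

$$\{x \in R_1 : \phi^*_{R_1}(x) > 1\}, \qquad R_1 \in \mathcal{D}^r.$$

Observe that this set is the union of the maximal cubes $M \in \mathcal{D}^r$ such that

$$\Big| \sum_{Q :\, M \subsetneq Q \subset R_1} \phi_Q(x) \Big| > 1$$

for $x \in M$. Let $\mathcal{M}$ stand for the collection of these maximal cubes, and let

$$\mathcal{N} := \{Q \in \mathcal{D}^r : Q \subset R_1,\ \nexists M \in \mathcal{M},\ Q \subset M\}.$$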
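Then

$$\{x \in R_1 : \phi^*_{R_1}(x) > 1\} = \Big\{ x \in R_1 : \Big| \sum_{Q \in \mathcal{N}} \phi_Q(x) \Big| > 1 \Big\},$$

where

$$\sum_{Q \in \mathcal{N}} \phi_Q = \frac{2^{2\alpha-3} |R|}{B_1 w(R)} \sum_{Q \in \mathcal{N} \cap \mathcal{P}_\alpha(R)} f_Q,$$

hence, by Theorem 5.2 and $\|\mathbf{1}_{R_1} w\|_1 = w(R_1)$,

$$|\{x \in R_1 : \phi^*_{R_1}(x) > 1\}| \le B_1 \cdot \frac{2^{2\alpha-3} |R|}{B_1 w(R)}\, w(R_1) = 2^{2\alpha-3}\, \frac{|R|}{w(R)}\, w(R_1).$$

If $R_1 \in \mathcal{P}_\alpha(R)$, then the right side is directly dominated by $2^{2\alpha-3} \cdot 4^{-\alpha+1} |R_1| = \frac12 |R_1|$. For an arbitrary $R_1 \in \mathcal{D}^r$, observe that $\phi^*_{R_1} = \sum_P \phi^*_P$, where the summation ranges over the maximal $P \in \mathcal{P}_\alpha(R)$ with $P \subset R_1$. Since $\operatorname{supp} \phi^*_P \subset P$, and these cubes are disjoint, it follows that

$$|\{x \in R_1 : \phi^*_{R_1}(x) > 1\}| = \sum_P |\{x \in P : \phi^*_P(x) > 1\}| \le \frac12 \sum_P |P| \le \frac12 |R_1|.$$

Observing that

$$\phi^*_R = \frac{2^{2\alpha-3} |R|}{B_1 w(R)}\, f^*_{\mathcal{P}_\alpha(R)},$$

Lemma 5.5 (applied with $\delta = 1/2$) implies that

$$\Big| \Big\{ x \in R : f^*_{\mathcal{P}_\alpha(R)}(x) > t\, 2^{-2\alpha+3} B_1\, \frac{w(R)}{|R|} \Big\} \Big| = |\{x \in R : \phi^*_R(x) > t\}| \le 2^{-(t-1)/2} |R|.$$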



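Rescaling $t$ we can rewrite the inequality as

$$\Big| \Big\{ x \in R : f^*_{\mathcal{P}_\alpha(R)}(x) > 16 t\, \frac{w(R)}{|R|} \Big\} \Big| \le \sqrt{2} \cdot 2^{-t 4^{\alpha}/B_1} |R| \qquad \forall t > 0. \tag{5.28}$$

Denote the set above as $E_\alpha(t)$,

$$E_\alpha(t) := \Big\{ x \in R : f^*_{\mathcal{P}_\alpha(R)}(x) > 16 t\, \frac{w(R)}{|R|} \Big\}.$$

We want to estimate the set where

$$\sum_{\alpha=0}^{\infty} f^*_{\mathcal{P}_\alpha(R)}(x) > T.$$

If this happens for $x \in R$, then either $f^*_{\mathcal{P}_0(R)}(x) > T/2$, or

$$\sum_{\alpha=1}^{\infty} f^*_{\mathcal{P}_\alpha(R)}(x) > T/2.$$

The latter inequality implies that either $f^*_{\mathcal{P}_1(R)}(x) > T/4$ or

$$\sum_{\alpha=2}^{\infty} f^*_{\mathcal{P}_\alpha(R)}(x) > T/4,$$

and so on.

Repeating this reasoning with $T = 16 w(R) t / |R|$, we can see that

$$\Big\{ x \in R : f^*_{\mathcal{P}(R)}(x) > 16 t\, \frac{w(R)}{|R|} \Big\} \subset \bigcup_{\alpha \ge 0} E_\alpha(2^{-\alpha-1} t),$$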
so using (5.28) we get

$$|R|^{-1} \Big| \Big\{ x \in R : f^*_{\mathcal{P}(R)}(x) > 16 t\, \frac{w(R)}{|R|} \Big\} \Big| \le \sqrt{2} \sum_{\alpha=0}^{\infty} 2^{-2^{\alpha} t/(2B_1)}$$

$$\le \sqrt{2} \sum_{\alpha=0}^{\infty} 2^{-t/(2B_1) - \alpha} \qquad \text{if } t \ge 2B_1$$

$$= 2\sqrt{2} \cdot 2^{-t/(2B_1)},$$

which proves (5.25). We have proved (5.25) for $t \ge 2B_1$, but for $t < 2B_1$ this estimate is trivial, because the right side is greater than $|R|$. Thus, (5.25) holds for all $t > 0$.
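
The summation step behind (5.25) can be spelled out as follows. This is a sketch under the abbreviation $x := t/(2B_1)$, which is introduced here only for readability and is not notation from the text.

```latex
% For x = t/(2B_1) \ge 1 and every integer \alpha \ge 0 we have
% 2^{\alpha} \ge \alpha + 1, hence
2^{\alpha} x = x + (2^{\alpha} - 1)\, x \ge x + \alpha ,
% and therefore
\sqrt{2} \sum_{\alpha = 0}^{\infty} 2^{-2^{\alpha} x}
  \le \sqrt{2}\; 2^{-x} \sum_{\alpha = 0}^{\infty} 2^{-\alpha}
  = 2\sqrt{2}\; 2^{-x} .
% For 0 < x < 1 the bound is trivial: 2\sqrt{2}\, 2^{-x} > \sqrt{2} > 1 .
```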

To prove (5.26), let us first recall that all our cubes are in $\mathcal{Q} = \mathcal{Q}_k$, so (5.10) holds for all of them. If, in addition, $Q \in \mathcal{P}_\alpha(R)$, then (5.27) (the definition of $\mathcal{P}_\alpha(R)$) is satisfied, and combining these two estimates we get

$$2^k 4^{\alpha-1}\, \frac{|R|}{w(R)} \le \frac{w^{-1}(Q)}{|Q|} \le 2^{k+1} 4^{\alpha}\, \frac{|R|}{w(R)} \qquad \forall Q \in \mathcal{P}_\alpha(R). \tag{5.29}$$

So $w^{-1}(Q)$ can be estimated via $|Q|$, and we will use the known estimates of the Lebesgue measure of level sets to get the estimates of the $w^{-1}$ measure.
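
The two-sided bound (5.29) is a short computation. The sketch below assumes that (5.10) is the condition $2^k \le \frac{w(Q)}{|Q|} \cdot \frac{w^{-1}(Q)}{|Q|} < 2^{k+1}$ defining $\mathcal{Q}_k$, and that (5.27) reads $4^{-\alpha} w(R)/|R| < w(Q)/|Q| \le 4^{-\alpha+1} w(R)/|R|$; both forms are inferred from how these estimates are used in this section.

```latex
% Upper bound: divide the upper bound in (5.10) by the lower bound in (5.27).
\frac{w^{-1}(Q)}{|Q|}
  < 2^{k+1} \Big( \frac{w(Q)}{|Q|} \Big)^{-1}
  < 2^{k+1}\, 4^{\alpha}\, \frac{|R|}{w(R)} ,
% Lower bound: divide the lower bound in (5.10) by the upper bound in (5.27).
\frac{w^{-1}(Q)}{|Q|}
  \ge 2^{k} \Big( \frac{w(Q)}{|Q|} \Big)^{-1}
  \ge 2^{k}\, 4^{\alpha - 1}\, \frac{|R|}{w(R)} .
```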
Let us consider the set where

$$f^*_{\mathcal{P}_\alpha(R)}(x) > 20 t\, \frac{w(R)}{|R|}.$$

This set is a disjoint union of cubes $Q' \in \mathcal{D}^r$, which are the first (maximal) cubes $Q$ for which the sum in (5.24) defining $f^*_{\mathcal{P}_\alpha(R)}$ exceeds $20 t \cdot w(R)/|R|$. Unfortunately the cubes $Q'$ are not necessarily in $\mathcal{P}_\alpha(R)$, so we cannot use (5.29) for them. But their parents are in $\mathcal{P}_\alpha(R)$ (because the summation is over $\mathcal{P}_\alpha(R)$).

So, let $\mathcal{E}_\alpha(t)$ be the collection of such parents, and let

$$\widetilde{E}_\alpha(t) := \bigcup_{Q \in \mathcal{E}_\alpha(t)} Q.$$

Note that to get $\widetilde{E}_\alpha(t)$ it is sufficient to take the union of the maximal cubes $Q \in \mathcal{E}_\alpha(t)$, so the set $\widetilde{E}_\alpha(t)$ is a disjoint union of cubes $Q \in \mathcal{P}_\alpha(R)$. Since for $Q \in \mathcal{P}_\alpha(R)$

$$\|f_Q\|_\infty \le \frac{w(Q)}{|Q|} \le 4^{-\alpha+1}\, \frac{w(R)}{|R|},$$

we can conclude that for all $Q \in \mathcal{E}_\alpha(t)$ and all $t \ge 4^{-\alpha}$

$$\Big| \sum_{R' \in \mathcal{P}_\alpha(R) :\, Q \subsetneq R'} f_{R'}(x) \Big| > 20 t\, \frac{w(R)}{|R|} - 4 \cdot 4^{-\alpha}\, \frac{w(R)}{|R|} \ge 16 t\, \frac{w(R)}{|R|} \qquad \forall x \in Q$$

(because the corresponding sum for one of the children $Q'$ of $Q$ exceeds $20 t \cdot w(R)/|R|$ on $Q'$, and the difference between the two sums is $f_Q$; we also use that the sum in the left hand side is constant on $Q$).

So $f^*_{\mathcal{P}_\alpha(R)}(x) > 16 t \cdot w(R)/|R|$ on $Q$, and we conclude that for $t \ge 4^{-\alpha}$ the inclusion $\widetilde{E}_\alpha(t) \subset E_\alpha(t)$ holds. Using the estimate (5.28) for $|E_\alpha(t)|$ (and replacing $\sqrt{2}$ by 2 there) we get that for $t \ge 4^{-\alpha}$

$$|\widetilde{E}_\alpha(t)| \le 2 \cdot 2^{-t 4^{\alpha}/B_1} |R|. \tag{5.30}$$

Note that for $t < 4^{-\alpha}$ the above estimate is trivial, so it holds for all $t > 0$.
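
The reason for replacing $\sqrt{2}$ by 2 becomes visible in the range of small $t$. A short check, assuming $B_1 \ge 1$ as recalled earlier in the proof, and using that the set estimated in (5.30) is contained in $R$:

```latex
% For 0 < t < 4^{-\alpha} and B_1 \ge 1 the exponent satisfies t 4^{\alpha}/B_1 < 1, so
2 \cdot 2^{-t 4^{\alpha}/B_1}\, |R| > 2 \cdot 2^{-1}\, |R| = |R| ,
% i.e. the right side of (5.30) exceeds |R| and the bound is trivial.
% With the constant \sqrt{2} the same argument would only give the lower
% bound \sqrt{2} \cdot 2^{-1} |R| < |R|, which is not enough.
```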
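Since by (5.29) for all $Q \in \mathcal{P}_\alpha(R)$

$$w^{-1}(Q) \le 2^{k+1} 4^{\alpha}\, \frac{|R|}{w(R)}\, |Q|,$$

summing over maximal cubes in $\mathcal{E}_\alpha(t)$ we get

$$w^{-1}(\widetilde{E}_\alpha(t)) \le 2^{k+1} 4^{\alpha}\, \frac{|R|}{w(R)}\, |\widetilde{E}_\alpha(t)|$$

$$\le 2^{k+1} 4^{\alpha}\, \frac{|R|}{w(R)} \cdot 2 \cdot 2^{-t 4^{\alpha}/B_1} |R| \qquad \text{by (5.30)}$$

$$\le 4^{\alpha} \cdot 2^2 \cdot 2^{-t 4^{\alpha}/B_1}\, w^{-1}(R) \qquad \text{by (5.10)}. \tag{5.31}$$

Now we want to estimate $w^{-1}(\widetilde{E}(t))$, where

$$\widetilde{E}(t) := \Big\{ x \in R : f^*_{\mathcal{P}(R)}(x) > 20 t\, \frac{w(R)}{|R|} \Big\}.$$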



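Let $T := 20 t \cdot w(R)/|R|$. If for $x \in R$

$$\sum_{\alpha=0}^{\infty} f^*_{\mathcal{P}_\alpha(R)}(x) > T,$$

then either $f^*_{\mathcal{P}_0(R)}(x) > T/2$ (in which case $x \in \widetilde{E}_0(t/2)$) or

$$\sum_{\alpha=1}^{\infty} f^*_{\mathcal{P}_\alpha(R)}(x) > T/2.$$

If the latter inequality holds, then either $f^*_{\mathcal{P}_1(R)}(x) > T/4$, so $x \in \widetilde{E}_1(t/4)$, or

$$\sum_{\alpha=2}^{\infty} f^*_{\mathcal{P}_\alpha(R)}(x) > T/4.$$

Repeating this reasoning we get that

$$\widetilde{E}(t) \subset \bigcup_{\alpha \ge 0} \widetilde{E}_\alpha(t 2^{-\alpha-1}),$$

so

$$w^{-1}(\widetilde{E}(t)) \le \sum_{\alpha=0}^{\infty} w^{-1}(\widetilde{E}_\alpha(t 2^{-\alpha-1}))$$

$$\le 4 w^{-1}(R) \sum_{\alpha=0}^{\infty} 4^{\alpha}\, 2^{-t 2^{\alpha-1}/B_1} \qquad \text{by (5.31)}$$

$$\le 4 w^{-1}(R) \cdot 6 \cdot 2^{-t/(2B_1)} \qquad \text{if } t \ge 2B_1.$$

To prove the last inequality we need for $t \ge 2B_1$ to estimate the sum

$$\sum_{\alpha=0}^{\infty} 2^{2\alpha - t 2^{\alpha}/(2B_1)}.$$

Since $2^{\alpha} \ge 3\alpha + 2$ for $\alpha \ge 4$, we can estimate for $\alpha \ge 4$ and $t \ge 2B_1$

$$2\alpha - t 2^{\alpha}/(2B_1) \le 2\alpha - t \cdot (3\alpha + 2)/(2B_1)$$

$$= \big(2\alpha - 2\alpha t/(2B_1)\big) - \alpha t/(2B_1) - 2t/(2B_1)$$

$$\le 0 - \alpha - t/(2B_1),$$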
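so

$$\sum_{\alpha=4}^{\infty} 2^{2\alpha - t 2^{\alpha}/(2B_1)} \le \sum_{\alpha=4}^{\infty} 2^{-\alpha} \cdot 2^{-t/(2B_1)} = \frac18\, 2^{-t/(2B_1)}.$$

For $\alpha = 0, 1, 2, 3$ we can estimate

$$2^{2\alpha - t 2^{\alpha}/(2B_1)} \le c_\alpha 2^{-t/(2B_1)}, \qquad \text{where } c_0 = 1,\ c_1 = c_2 = 2,\ c_3 = \tfrac12,$$

so adding everything we get that

$$w^{-1}(\widetilde{E}(t)) \le 24 \cdot 2^{-t/(2B_1)}\, w^{-1}(R).$$

We proved that estimate for $t \ge 2B_1$, but for $t < 2B_1$ the estimate is trivial because the right side is bigger than $w^{-1}(R)$. So the estimate holds for all $t > 0$. □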
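
The constant 24 in (5.26) can be checked by adding up the pieces. In the sketch below, the abbreviations $x := t/(2B_1) \ge 1$ and $a_\alpha$ are introduced for illustration only; they do not appear in the text.

```latex
% With x = t/(2B_1) \ge 1, the terms a_\alpha := 2^{2\alpha - 2^{\alpha} x} satisfy
a_0 = 2^{-x}, \quad
a_1 = 2^{2 - x}\, 2^{-x} \le 2 \cdot 2^{-x}, \quad
a_2 = 2^{4 - 3x}\, 2^{-x} \le 2 \cdot 2^{-x}, \quad
a_3 = 2^{6 - 7x}\, 2^{-x} \le \tfrac12 \cdot 2^{-x},
% and the tail \alpha \ge 4 contributes at most \tfrac18 \cdot 2^{-x}, so
\sum_{\alpha \ge 0} a_\alpha
  \le \Big( 1 + 2 + 2 + \tfrac12 + \tfrac18 \Big) 2^{-x}
  = \tfrac{45}{8}\, 2^{-x}
  \le 6 \cdot 2^{-x},
\qquad
4 \cdot 6 = 24 .
```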

5.5. Conclusion of the proof. 

Lemma 5.7. For any $R \in \mathcal{G}^*$

$$\|f_{\mathcal{P}(R)}\|_{L^2} \le C_1 B_1\, \frac{w(R)}{|R|^{1/2}}, \tag{5.32}$$

$$\|f_{\mathcal{P}(R)}\|_{L^2(w^{-1})} \le C_2 B_1\, \frac{w(R)}{|R|}\, \big(w^{-1}(R)\big)^{1/2}, \tag{5.33}$$

where $C_1$ and $C_2$ are absolute constants and $B_1$ is given by (5.23).

This lemma is proved by using the distributional inequalities from Lemma 5.6 and computing the norms using distribution functions. That will give the desired estimates for the norms of the maximal function $f^*_{\mathcal{P}(R)}$, and since $|f_{\mathcal{P}(R)}| \le f^*_{\mathcal{P}(R)}$, we get the conclusion of the lemma. We leave the details as a trivial exercise for the reader.
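
For orientation, the exercise for the unweighted norm can be sketched as follows. The computation assumes that (5.25) has the form $|\{x \in R : f^*_{\mathcal{P}(R)}(x) > 16 t\, w(R)/|R|\}| \le 2\sqrt{2} \cdot 2^{-t/(2B_1)} |R|$ and that (5.32) bounds $\|f_{\mathcal{P}(R)}\|_{L^2}$ by $C_1 B_1 w(R)/|R|^{1/2}$; the abbreviation $a := 16\, w(R)/|R|$ and the resulting value of $C_1$ are illustrative and not optimized.

```latex
% Layer-cake formula with the substitution s = a t, a := 16 w(R)/|R|:
\| f^*_{\mathcal{P}(R)} \|_{L^2}^2
  = \int_0^\infty 2 s \, |\{ f^*_{\mathcal{P}(R)} > s \}| \, ds
  = 2 a^2 \int_0^\infty t \, |\{ f^*_{\mathcal{P}(R)} > a t \}| \, dt
% insert (5.25) and use \int_0^\infty t e^{-c t} dt = c^{-2} with c = \ln 2 / (2 B_1):
  \le 2 a^2 \cdot 2\sqrt{2}\, |R| \int_0^\infty t \, 2^{-t/(2B_1)} \, dt
  = 4\sqrt{2}\, a^2 |R| \, \frac{4 B_1^2}{(\ln 2)^2}
  = \frac{4096 \sqrt{2}}{(\ln 2)^2} \, B_1^2 \, \frac{w(R)^2}{|R|} ,
% so one admissible choice is C_1 = 64 \cdot 2^{1/4} / \ln 2 .
```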

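Recall that to prove the main result we need to prove estimate (5.11) for all cubes $Q_0 \in \mathcal{Q} = \mathcal{Q}_k$. For a cube $Q \in \mathcal{Q}$, let $\mathcal{Q}(Q) := \{Q' \in \mathcal{Q} : Q' \subset Q\}$. We want to estimate $\|f_{\mathcal{Q}(Q_0)}\|_{L^2(w^{-1})}$, $Q_0 \in \mathcal{Q}$, where

$$f_{\mathcal{Q}(Q_0)} = \sum_{Q \in \mathcal{Q}(Q_0)} f_Q.$$

Since (see (5.19))

$$f_{\mathcal{Q}(Q_0)} = \sum_{Q \in \mathcal{G}^*(Q_0)} f_{\mathcal{P}(Q)},$$

we can write

$$\|f_{\mathcal{Q}(Q_0)}\|_{L^2(w^{-1})}^2 \le \sum_{R \in \mathcal{G}^*(Q_0)} \|f_{\mathcal{P}(R)}\|_{L^2(w^{-1})}^2 + 2 \sum_{Q, R \in \mathcal{G}^*(Q_0) :\, Q \subsetneq R} \Big| \int f_{\mathcal{P}(Q)} f_{\mathcal{P}(R)} w^{-1} \Big|$$

$$= S_1 + S_2.$$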

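The first sum is easy to estimate. By (5.33)

$$\|f_{\mathcal{P}(R)}\|_{L^2(w^{-1})}^2 \le [C_2 B_1]^2 \Big( \frac{w(R)}{|R|} \Big)^2 w^{-1}(R) \le [C_2 B_1]^2\, 2^{k+1} w(R), \qquad \text{because } R \in \mathcal{Q} = \mathcal{Q}_k.$$

Summing over all $R \in \mathcal{G}^*(Q_0)$ we get using (5.18)

$$S_1 \le 2 [C_2 B_1]^2\, 2^k \sum_{R \in \mathcal{G}^*(Q_0)} w(R) \le C B_1^2\, 2^k [w]_{A_2} w(Q_0),$$

where $C$ is an absolute constant.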
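Let us now estimate $S_2$.

Let $Q, R \in \mathcal{G}^*$, $Q \subsetneq R$. Then $f_{\mathcal{P}(R)}(x)$ is constant on $Q$; let us use the symbol $f_{\mathcal{P}(R)}(Q)$ to denote this constant. We then can estimate

$$\Big| \int f_{\mathcal{P}(Q)} f_{\mathcal{P}(R)} w^{-1} \Big| \le |f_{\mathcal{P}(R)}(Q)|\, \|f_{\mathcal{P}(Q)}\|_{L^2(w^{-1})} \big(w^{-1}(Q)\big)^{1/2} \qquad \text{by Cauchy-Schwarz}$$

$$\le C_2 B_1 |f_{\mathcal{P}(R)}(Q)|\, \frac{w(Q)}{|Q|}\, w^{-1}(Q) \qquad \text{by (5.33)}$$

$$\le 2^{k+1} C_2 B_1 |f_{\mathcal{P}(R)}(Q)| \cdot |Q| \qquad \text{because } Q \in \mathcal{Q}. \tag{5.34}$$

Using this estimate we can write

$$\sum_{Q \in \mathcal{G}^* :\, Q \subsetneq R} \Big| \int f_{\mathcal{P}(Q)} f_{\mathcal{P}(R)} w^{-1} \Big| \le 2^{k+1} C_2 B_1 \sum_{Q \in \mathcal{G}^* :\, Q \subsetneq R} |f_{\mathcal{P}(R)}(Q)|\, |Q| \qquad \text{by (5.34)}$$

$$= 2^{k+1} C_2 B_1 \int_R |f_{\mathcal{P}(R)}| \sum_{Q \in \mathcal{G}^* :\, Q \subsetneq R} \mathbf{1}_Q$$

$$\le 2^{k+1} C_2 B_1 \|f_{\mathcal{P}(R)}\|_{L^2} \Big\| \sum_{Q \in \mathcal{G}^* :\, Q \subsetneq R} \mathbf{1}_Q \Big\|_{L^2} \qquad \text{by Cauchy-Schwarz}$$

$$\le 2^{k+2} C_1 C_2 B_1^2\, w(R) \qquad \text{by (5.32) and (5.17)}.$$

Therefore, using (5.18)

$$S_2 \le 2^{k+3} C_1 C_2 B_1^2 \sum_{R \in \mathcal{G}^*(Q_0)} w(R) \le C (B_1)^2\, 2^k [w]_{A_2} w(Q_0)$$

with some absolute constant $C$. But that is exactly the estimate (5.11), so Theorem 5.1 is proved. □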